aboba opened this issue 4 years ago
Hi Bernard, thanks for keeping us up-to-date!
We have not been able to upgrade to the latest version of the Google code yet, but we are working on it. The Insertable Streams API looks a priori extremely promising, if it does what I think it does, and if so I am pretty sure we want to integrate it ASAP. But that first requires the upgrade to Google's latest code (ongoing on the experimental/undock branch), and that upgrade is currently blocked: we are waiting on microsoft/winrtc to reach parity on video capture with what we currently have with webrtc-uwp-sdk, which is scheduled to happen soon. Until then we cannot switch to that branch, as we would have no video on UWP, which would be a regression compared to our current master.
@djee-ms One of the goals of the Insertable Streams Origin Trial is to gather feedback on the API from developers. @alvestrand had a question about how much meta-data might need to be inserted within the "Insertable Stream". This might affect the interaction with congestion control. Are we talking about ~10 B of meta-data or 1 KB?
We didn't investigate much, but my intuition is that it is much closer to 1 KB than 10 B; for ~10 B you could already get by with some custom RTP header extension, but the biggest problem there is the size limit. A typical use is passing a camera matrix associated with the head pose, so at minimum 3-4 floats (12-16 B), and I think that is already too big for a single RTP header extension; you would need two. And that is really the bare minimum information needed; I expect developers to make good use of more space.
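For illustration, here is a minimal sketch (TypeScript, against the experimental Chrome API shape) of what carrying a per-frame head pose through Insertable Streams could look like: the sender appends a few float32s to each encoded frame, and the receiver strips them off again before decoding. The appended-trailer layout, `currentHeadPose()` and `onPoseReceived()` are hypothetical app-side pieces; only the TransformStream plumbing mirrors the experimental API.

```typescript
// Sketch: carry a per-frame head pose (4 x float32 = 16 bytes) as a trailer
// inside the encoded frame data. The trailer layout and the two declared
// functions are hypothetical; only the Insertable Streams plumbing is real.
declare function currentHeadPose(): Float32Array;          // hypothetical app-side pose source
declare function onPoseReceived(pose: Float32Array): void; // hypothetical app callback

const POSE_BYTES = 16;

function appendPose(encodedFrame: any, pose: Float32Array): void {
  const original = new Uint8Array(encodedFrame.data);
  const out = new Uint8Array(original.byteLength + POSE_BYTES);
  out.set(original, 0);
  out.set(new Uint8Array(pose.buffer, pose.byteOffset, POSE_BYTES), original.byteLength);
  encodedFrame.data = out.buffer;
}

function extractPose(encodedFrame: any): Float32Array {
  const buf: ArrayBuffer = encodedFrame.data;
  const poseOffset = buf.byteLength - POSE_BYTES;
  const pose = new Float32Array(buf.slice(poseOffset)); // copy out the trailer
  encodedFrame.data = buf.slice(0, poseOffset);         // restore the original encoded frame
  return pose;
}

// Sender-side transform: attach the current pose to every outgoing frame.
const senderTransform = new TransformStream({
  transform(encodedFrame, controller) {
    appendPose(encodedFrame, currentHeadPose());
    controller.enqueue(encodedFrame);
  },
});

// Receiver-side transform: strip the pose and hand it to the application.
const receiverTransform = new TransformStream({
  transform(encodedFrame, controller) {
    onPoseReceived(extractPose(encodedFrame));
    controller.enqueue(encodedFrame);
  },
});
```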
It would be interesting to know the threshold at which the interaction with congestion control needs to be considered. If there are strong implementation reasons to keep it in the low hundreds of bytes, perhaps that would be workable for the scenarios we currently know about (a handful of transforms plus some user data). But as Jerome says, if we can avoid making devs worry about this limit, that would be great.
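For a rough sense of scale on that threshold, a back-of-envelope sketch, assuming 30 fps video and ignoring RTP/packetization overhead (the numbers are illustrative, not measurements):

```typescript
// Extra payload bandwidth from per-frame metadata at an assumed 30 fps.
const fps = 30;
for (const bytesPerFrame of [16, 128, 1024]) {
  const kbps = (bytesPerFrame * 8 * fps) / 1000;
  console.log(`${bytesPerFrame} B/frame ≈ ${kbps.toFixed(1)} kbps of extra payload`);
}
// 16 B/frame   ≈ 3.8 kbps   (negligible next to a typical ~1-2 Mbps video stream)
// 128 B/frame  ≈ 30.7 kbps
// 1024 B/frame ≈ 245.8 kbps (no longer negligible, hence the congestion-control question)
```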
Chrome (and Edge) WebRTC implementations now have experimental support for the Insertable Streams API.
This API makes it possible to communicate metadata along with audio and video, keeping it in sync.
A presentation on the API can be found here.
An article on its use in AR is here.
Is it possible to support this API, given that it requires passing parameters within RTCConfiguration?
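For context, a minimal sketch of how the experimental API is wired up on the sender side in Chrome/Edge. The opt-in flag and method names changed during the Origin Trial (forceEncodedVideoInsertableStreams early on, encodedInsertableStreams later, with createEncodedStreams() on the sender), so the identifiers below are illustrative rather than normative; they are also not yet in the standard TypeScript DOM typings, hence the any casts.

```typescript
// Minimal sender-side sketch of the experimental Insertable Streams API.
// Names follow the Chrome Origin Trial shape and may differ between versions.
async function setupInsertableStreamsSender(): Promise<RTCPeerConnection> {
  // The opt-in flag is the RTCConfiguration parameter the question refers to;
  // it is not in the standard typings yet, hence the cast.
  const config = { encodedInsertableStreams: true } as any;
  const pc = new RTCPeerConnection(config);

  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  const sender = pc.addTrack(stream.getVideoTracks()[0], stream);

  // Obtain the encoded-frame streams and insert a pass-through transform.
  const { readable, writable } = (sender as any).createEncodedStreams();
  const passThrough = new TransformStream({
    transform(encodedFrame, controller) {
      // encodedFrame.data is an ArrayBuffer holding the encoded video frame;
      // it can be inspected or rewritten here before it is packetized.
      controller.enqueue(encodedFrame);
    },
  });
  readable.pipeThrough(passThrough).pipeTo(writable);

  return pc;
}
```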