Open juunini opened 2 years ago
Hello,
It's normal. Basically, Hubs allows a maximum of 25 people in a single room. Depending on the device used (for example, an Android phone) and on the 3D environment you are in, this can cause latency. For 25 people to share a room without sound problems, the environment must be well optimized, with simple avatars, and each user needs a good connection.
Hi, thanks for the problem report.
Yes, the maximum capacity of a room is 25. There may be no audio quality guarantee for more than 25 people.
https://hubs.mozilla.com/docs/hubs-room-settings.html#maximum-capacity
But 20 is under that maximum capacity. It is probably worth digging into the problem, especially if it can happen with even fewer people in a room.
So let me ask some questions to investigate the problem more deeply.
@takahirox
@juunini Hi, thanks for the answers. Would you mind checking the "Disable audio left/right panning" checkbox in the preferences and seeing whether that mitigates the problem? You can find the option in "More" - "Preferences" - "Audio" tab.
I tested and confirmed that the audio can be very choppy, or I may even hear nothing, when there are many people (about ten or more) in a room on my Samsung Galaxy S8+ with Chrome.
I realized that the problem gets worse if I move or rotate. So I speculate that the problem is related to Positional Audio (the left/right panner), especially when CPU (or memory?) pressure is high. In fact, if I check "Disable audio left/right panning", the audio problem is greatly mitigated.
If my speculation is true, potential solutions or workarounds may be
Can I ask for volunteers among Android device users? Would you try the following steps and check whether you can reproduce the problem on your device? My device is a Samsung Galaxy S8+, which supports up to Android 9 and nothing newer. So I feel there may be a chance that this Positional Audio problem happens only on old Android versions.
# replace roomId and roomName with your room's ones
$ ./run-bot.js -u https://hubs.mozilla.com/roomId/roomName -a ./bot-recording.mp3 -d ./bot-recording.json
On my Samsung Galaxy S8+ with Chrome, the audio problem happened with about ten or more bots in a room. I entered the same room from Chrome on Windows 10 and Safari on an iPhone X and didn't see the problem on them. (I guess the reason the problem didn't happen on my iPhone X might be that we force-disable Positional Audio on iOS?)
I did try and run the bot script, but it appears puppeteer isn't working currently on M1 Macs :-(
I have been looking into this today using a Samsung Galaxy S8 with Chrome. These were my tests:
After this test I think it may not be related to the number of avatars in the room (so not related to the WebRTC stack), and it's probably more related to Panner Node performance on mobile devices or to WebAudio pipeline issues.
Things that I'd try:
@rawnsley Oh
@keianhzo
Thanks for the trial and comments. It sounds like we reproduced the same audio problem. What I want to know first is whether this likely Positional Audio problem is reproducible on newer Android devices, at least ones that support the latest Android version, to determine how much effort we should put into investigating it. So I'm waiting for volunteers who have newer devices.
The problem also seems closely tied to poor performance, possibly in both audio-related (Positional Audio) and non-audio code, so I'm thinking of finding the audio-related and non-audio performance bottlenecks and trying to optimize them.
Tried this on my Pixel 6 Pro on Android 12.
In Firefox I start getting noticeable crackling at around 4 bots, much worse at around 7, very bad above 10.
In Chrome there is some minor crackling around 10, it starts to get pretty bad at 13, and rapidly gets worse after that.
Some of what I am hearing just sounds like the audio decoder is struggling to keep up, so I wonder how valid this test case is. It's pretty rare that you will have so many people actively speaking at the same time. Maybe a mix of speaking/non-speaking bots might be a good test case.
It might also be worth looking into adjusting the panning model on mobile. Three's PositionalAudio sets it to "HRTF", but I believe that is a good deal more costly. https://developer.mozilla.org/en-US/docs/Web/API/PannerNode/panningModel
Related to this, we looked into Google's Resonance Audio SDK a very long time ago. It supposedly scales better with many audio sources: https://resonance-audio.github.io/resonance-audio/develop/web/developer-guide Though it would seem we should be under the number of sources where performance breaks down.
With panning disabled:
In Chrome I get up to around 11 without any issue, but then it very quickly breaks down after that. Not so much crackling but lots of audio skipping.
In Firefox I start to get some minor crackling around 8, it gets pretty bad around 10 or 11, and then it quickly breaks down after that.
So it was slightly better without panning in Chrome, but more noticeably in Firefox. The full breakdown point in Chrome seems to be about the same, and sounds less like a PannerNode issue and more like the decoding issues I described above.
(Note it's also pretty hard to tell if things are working right, just because it's impossible to actually listen to that many people talking at once and make sense of it, hehe.)
Same issue on a Galaxy Z Fold3.
If the scene is a museum, not many people are speaking, but many videos are playing audio. T.T
@netpro2k
Thanks for testing. I want to raise the priority of this issue because it's reproducible with about 10 people in a room even on Android 12.
Maybe a mix of speaking/non-speaking bots might be a good test case.
As we discussed internally, yes, most codecs seem to have a concept of empty frames, and WebRTC also has a way to flag a silent stream and not even send data. Probably the fewer mics are on, or the fewer people speak in a room, the less audio decode pressure we have.
But note that this problem can happen even in practical use cases. I joined a few rooms with 10-15 people on my Android device, and the audio crackled badly or I couldn't hear anything at all. (There is a chance that my old Android 9 device triggers the problem much more easily.)
It might also be worth looking into adjusting the panning model on mobile. Three's PositionalAudio sets it to "HRTF", but I believe that is a good deal more costly. https://developer.mozilla.org/en-US/docs/Web/API/PannerNode/panningModel
Cool, it should be worth digging into the positional audio parameters.
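For reference, a minimal sketch of what adjusting the panning model on mobile could look like. The helper name `choosePanningModel` and the user-agent check are assumptions for illustration, not existing Hubs or three.js code. The two return values are the standard `PannerNode.panningModel` values.

```javascript
// Hypothetical helper, not part of Hubs or three.js.
// Picks the cheaper "equalpower" panning model on mobile user agents
// and keeps "HRTF" (the model three.js PositionalAudio sets) elsewhere.
function choosePanningModel(userAgent) {
  // Assumption: a simple user-agent test is good enough to detect mobile.
  const isMobile = /Android|iPhone|iPad|iPod/i.test(userAgent);
  return isMobile ? "equalpower" : "HRTF";
}

// Possible use in a browser, where positionalAudio is a THREE.PositionalAudio:
//   positionalAudio.panner.panningModel = choosePanningModel(navigator.userAgent);
```

Whether "equalpower" is enough to avoid the crackling on a given device would still have to be measured.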
So it was slightly better without panning in Chrome, but more noticeably in Firefox.
Good to know that disabling the panner node can mitigate the problem. But bad to know that it isn't a perfect solution.
As @brianpeiris mentioned internally, iOS likely had the same problem and fixed it, so it won't be surprising if Android has similar problems.
So in the medium or long term we should (make micro tests which can reproduce the problem and) report it to the Android devs. (I hope) the fact that iOS likely fixed the problem can push them.
Let me keep working on this problem.
Update: As we discussed internally https://github.com/gfodor/hubs-mpl/blob/master/src/systems/avatar-audio-track-system.js can help
I made a micro test which can reproduce the problem.
https://takahirox.github.io/pannernode-test/index.html
Can I ask for volunteers again, Android 12 users? Would you please try the test and check whether the audio breaks on your devices?
On my Samsung Galaxy S8+ with Android 9 and Chrome, the audio crackles with 14 or more panners using the HRTF panning model, and there is no audio with 30 or more panners.
I want to report the problem to the Android devs once we confirm that the problem is reproducible with the test on Android 12.
Some notes:
All of these tests are on my Pixel 6 Pro running Android 12:
Chrome: With HRTF, moving nodes, no added CPU load
I can get up to about 30 without much crackling
30-50: crackling starts to ramp up, worse when interacting with the page
Above 50: crackling continues to get worse, and audio dies when interacting with the page
Above 80: no audio
Chrome: With Equal Power, moving nodes, no added CPU load
I can get up to about 40 or 50 without too much crackling, and it continues to ramp up fairly linearly for the checkpoints I tested
Audio plays all the way up to 100 nodes, though with a lot of crackling and cutting out. Interacting with the page completely cuts out the audio for several seconds, but it comes back.
Chrome: With HRTF, non-moving, no added CPU load
I can get up to about 40 without too much crackling
Audio plays all the way up to 100 nodes with a decent amount of crackling but no dropouts. Interacting with the page does not break things.
Chrome: With Equal Power, non-moving, no added CPU load
I can get up to about 40 without too much crackling
Audio plays all the way up to 100 without the crackling ever getting too bad (hard to tell at the high end since so much audio is playing). Interacting with the page does not break things.
I didn't run all the tests in Firefox because things start to fall over pretty quickly. Let me know if running more tests in Firefox would be helpful.
Firefox: With HRTF, moving nodes, no added CPU load
At around 5 I already start to get some crackling and distortion, getting bad pretty quickly above that
I can play up to 100, but there is massive crackling and distortion. The audio also appears to be playing back at a lower pitch and rate. Interacting with the page does not seem to affect things.
Firefox: With Equal Power, non-moving, no added CPU load
Crackling starts to get pretty bad around 30 or 40
At 100 there is pretty massive distortion and some crackling, but not the same slowed playback rate as above.
Filed an issue https://bugs.chromium.org/p/chromium/issues/detail?id=1308962
It will probably take at least a few months to resolve the problem. I want to think about our short-term workarounds tomorrow.
Thanks for testing, all.
As a short-term workaround, I think disabling the panner node by default, or forcing it off on Android devices, is good because it gives a huge improvement and the change should be small.
Actually, I entered a room with 18 people today on my Samsung Galaxy S8+. I couldn't hear anything with the panner node enabled. After disabling it, the audio became fine.
We can think of other workarounds later if disabling the panner node isn't good enough.
Switching the panning model is interesting to me, but I want to treat it as an optimization, separately from the Android audio problem.
Hi, old hubs/jel use this positional updateMatrixWorld hack: https://github.com/jel-app/jel-mpl/blob/master/src/hubs/utils/threejs-positional-audio-updatematrixworld.js
Maybe this could be updated in a more stable way.
Do you know whether not using linearRampToValueAtTime in updateMatrixWorld solves the Android problem?
https://github.com/mrdoob/three.js/blob/dev/src/audio/PositionalAudio.js#L116
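To make the question concrete, here is a rough sketch of the two ways a panner position can be written: with a ramp, or directly with `setValueAtTime`. The function name `updatePannerPosition` and its options are made up for illustration. Only `positionX`/`positionY`/`positionZ`, `linearRampToValueAtTime`, and `setValueAtTime` are real Web Audio API names. Whether skipping the ramp avoids the Android problem is not confirmed anywhere in this thread.

```javascript
// Hypothetical helper, not the three.js or Hubs implementation.
// panner: a PannerNode-like object exposing positionX/Y/Z AudioParams.
// position: { x, y, z } in world space.
// currentTime: the AudioContext currentTime at the moment of the update.
function updatePannerPosition(panner, position, currentTime, options = {}) {
  const { useRamp = false, rampSeconds = 0 } = options;
  for (const axis of ["X", "Y", "Z"]) {
    const param = panner["position" + axis];
    const value = position[axis.toLowerCase()];
    if (useRamp) {
      // Schedules a linear ramp that ends rampSeconds from now.
      param.linearRampToValueAtTime(value, currentTime + rampSeconds);
    } else {
      // Sets the value immediately, without scheduling a ramp.
      param.setValueAtTime(value, currentTime);
    }
  }
}
```

Comparing both modes in the pannernode-test page would show whether the ramp itself matters.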
I have done some testing on a Pixel with Android 12: https://github.com/arpu/pannernode-test/commit/11ce03a59050a32f462a5aa583f6a79ef6d4918d https://arpu.github.io/pannernode-test/
With Equal Power, moving nodes, and no added CPU load, I get 60 panners running fine at 60 fps.
With my S8 on Android 9 and Chrome 100, using https://takahirox.github.io/pannernode-test/index.html or https://arpu.github.io/pannernode-test/ I get the same results, with move panners and listener checked and no additional main thread CPU stress. With HRTF I start having audio issues at around 16 panners. With equalpower, I can add some more panners before I start having audio issues, at around 23 panners.
Can we fix this by limiting the maximum number of sound sources a single user can receive? For example, it is calculated based on distance and sound volume, and only the 5 loudest sound sources are received.
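A rough sketch of that idea, assuming each source has a position and a volume and that loudness falls off with an inverse-distance model (the same formula as the Web Audio "inverse" distance model with a rolloff factor of 1). The names `estimateGain` and `selectLoudestSources` and the shape of the source objects are assumptions for illustration, not Hubs code.

```javascript
// Hypothetical sketch, not Hubs code.
// source: { position: { x, y, z }, volume: number }
function estimateGain(source, listenerPosition, refDistance = 1) {
  const distance = Math.hypot(
    source.position.x - listenerPosition.x,
    source.position.y - listenerPosition.y,
    source.position.z - listenerPosition.z
  );
  // Inverse-distance attenuation, clamped so sources closer than
  // refDistance are not boosted above their own volume.
  const attenuation = refDistance / Math.max(distance, refDistance);
  return source.volume * attenuation;
}

// Returns at most maxSources sources, loudest first.
function selectLoudestSources(sources, listenerPosition, maxSources = 5) {
  return sources
    .map((source) => ({ source, gain: estimateGain(source, listenerPosition) }))
    .sort((a, b) => b.gain - a.gain)
    .slice(0, maxSources)
    .map((entry) => entry.source);
}
```

This only covers the selection step. Actually not receiving or not decoding the remaining streams would need changes elsewhere.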
We have introduced the Audio PanningQuality preference in #5454. Try low quality mode. It isn't a perfect solution but may mitigate the problem.
I made a PR to use low audio panning quality mode by default on Android: #5540. I hope we go with it and then see whether the problem still happens in practical use cases.
Description
To Reproduce Steps to reproduce the behavior:
Expected behavior
Screenshots no screen shot. sound issue.
Hardware