Hubs-Foundation / hubs

Duck-themed multi-user virtual spaces in WebVR. Built with A-Frame.
https://hubsfoundation.org
Mozilla Public License 2.0

[Epic] Improve audio spatialization behaviors for rooms and individual users #1853

Open misslivirose opened 4 years ago

misslivirose commented 4 years ago

Is your feature request related to a problem? Please describe.

In rooms with larger groups of people, it can be difficult to have multiple conversations going on simultaneously. It is also occasionally difficult to have audio streams playing while conversations might also be happening in the background. This was something that occurred somewhat regularly during the UIST sessions.

Right now, we support setting fall off on videos, but we don't have settings that are configurable more widely between different spaces or between users other than the individual audio controls on the avatar pause menu. There are many complex configurations around a fully featured 3D audio attenuation system to consider with how this could be done.

Describe the solution you'd like

At a basic level, room owners, or perhaps scene authors, should have additional control over the falloff / attenuation of the entire space. Another alternative to consider (if technically feasible) could be that users can select their own falloff for video and/or avatars, although this changes the contract that each user would experience a space the same way. We would need to be cognizant of the impact of that.

This is also generally related to user settings and accessibility, depending on the ultimate design and direction of the implementation.

Contributing

This issue needs a solid design and requirements outline. At the core, the challenge to solve is giving more control to make audio levels more customizable so that larger spaces can avoid having so much overlap between multiple sources, be they conversations or video feeds.

Additional context

Related: #1393


misslivirose commented 4 years ago

An additional design could be to add an indicator when someone was within hearing range, or have a "cone / sphere" of silence option that could be used to make small bubbles of audio privacy.

gfodor commented 4 years ago

Pause mode is getting somewhat crowded, but a light visual treatment showing the range of earshot will probably be more 'backgrounded' relative to menus etc., so it's maybe not too bad to try to add there.

emclaren commented 4 years ago

Just wanted to add that an organizer for a recent community event also requested a version of this: they wanted to make the sound of a video audible through their whole space, not just when standing very close to the screen.

brianpeiris commented 4 years ago

FYI, you can already set the falloff on videos in Spoke.

As for voice audio, I would suggest we spend time tuning our falloff factors before doing any additional work on settings. AFAIK, we never spent any time tuning them at all, so it's likely that we can improve things easily. It's already clear that our falloff is wrong when a couple of groups of people are talking to each other. Their falloff does not match what you would expect in real life.

gfodor commented 4 years ago

Also, the current audio volume controls on avatars are barely a v1. I feel pretty strongly that a low friction mechanic with good feedback for rapidly adjusting audio on a per-avatar basis could potentially mitigate issues here. For example (not proposing this), pointing at an avatar and rotating your wrist could adjust volume in a way that doesn't require much conscious thought or interaction and could lead to users continually, easily adjusting volume as needed.

emclaren commented 4 years ago

At last Friday's meetup we used a scene with background audio added in Spoke (https://hubs.mozilla.com/scenes/jxuznHE/dragonroot-island), and although for some participants the audio levels were fine, for others it was very loud. I tested on Quest and on Desktop and the difference was significant.

We had to change the scene to be able to continue having a conversation. It would be helpful if people could control the ratio of the volume of environmental sounds to the volume of voices in the room, so participants can set levels appropriate for their system and personal needs. This matters particularly because users have no control over sound clips that are built into the scene using Spoke.

gfodor commented 4 years ago

I think this does surface three adjacent problems:

If we addressed all three of these, that seems like a reasonable solution to this problem in a general sense. It seems better to solve this problem in the general sense for media, since if we focus on just scene media then room-spawned media will have the same issue(s).

Also I believe that you can change the audio on scene owned media, if you can actually see the media to access the controls. So to me the root design flaw is no way to actually control media you don't have nearby visible access to. (And perhaps there are other issues with our falloff in addition to this, but we'll need to investigate to understand what was going on wrt platform varying.)

blairmacintyre commented 4 years ago

That list helps with the object volumes, but not the people's volumes.

(I'd add to your list that there should be a checkbox on media items that turns off distance attenuation)

At a basic level, the pure distance-based falloff is simply a broken metaphor, when combined with the lack of environmental audio adjustment: in the real world, I can step behind a barrier and chat with you and most of the audio on the other side of the barrier is muted. Not here.

We need to manage this better, otherwise the rooms always have an unpleasant murmur.

emclaren commented 4 years ago

I gathered feedback from participants in today's Monday Project Meeting through a google form. In the "other feedback" section three of four respondents mentioned audio volume unprompted:

  1. It's hard to get volume quite right - even at max speaker + on youtube controls it was too quiet.
  2. It was helpful getting information on how to raise the volume!
  3. The volume and placement of the streaming window is a persistent pain point.

misslivirose commented 4 years ago

I'm going to use this as a bucket for other related discussions about audio to keep conversations consolidated.

From #2259:

Currently, all the audio is sent to all the players and is spatialized locally. That uses up a lot of bandwidth (both on the server and the clients) and a lot of CPU, and limits the number of users who can be in a room at once.

On the server side, you could do a simple distance check and not send the audio beyond a certain distance. The check could be very lightweight (use the squared distance, or even just the Manhattan distance).
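The distance check suggested above could look roughly like the following. This is a minimal sketch, not code from Hubs or its media server: the `Vec3` type, the function names, and the `maxDistance` parameter are all hypothetical, and it assumes the server knows each participant's position.

```typescript
// Hypothetical position type; not taken from the Hubs codebase.
interface Vec3 { x: number; y: number; z: number; }

// Squared Euclidean distance: comparing squared values avoids a square root.
function squaredDistance(a: Vec3, b: Vec3): number {
  const dx = a.x - b.x;
  const dy = a.y - b.y;
  const dz = a.z - b.z;
  return dx * dx + dy * dy + dz * dz;
}

// Manhattan distance: cheaper still, but never smaller than the Euclidean
// distance, so using it with the same cutoff drops some listeners early.
function manhattanDistance(a: Vec3, b: Vec3): number {
  return Math.abs(a.x - b.x) + Math.abs(a.y - b.y) + Math.abs(a.z - b.z);
}

// Assumed policy: forward a speaker's audio only to listeners within maxDistance.
function shouldForwardAudio(speaker: Vec3, listener: Vec3, maxDistance: number): boolean {
  return squaredDistance(speaker, listener) <= maxDistance * maxDistance;
}

const origin: Vec3 = { x: 0, y: 0, z: 0 };
shouldForwardAudio(origin, { x: 3, y: 4, z: 0 }, 5); // true: exactly 5 units away
shouldForwardAudio(origin, { x: 3, y: 4, z: 1 }, 5); // false: slightly beyond 5 units
```

A hard cutoff like this would probably need to sit at or beyond the distance where the client-side falloff is already near silent, otherwise voices would cut out abruptly.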

From #2294:

misslivirose commented 4 years ago

From #2272:

At conferences, it's not unusual to have someone making an announcement or giving a presentation. However, with distance attenuation, people will have trouble hearing the speaker if others are talking close by.

It would be nice if a room owner could set someone as a speaker, which would cause their audio to not be attenuated by distance. It would still be spatialized in terms of stereo panning, it would just be at a constant volume through the space. You could have multiple people set to be speakers, so you could have a panel discussion in front of an audience.

People would still be able to chat with people near them, but everyone would hear the speaker.
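One possible way to get this "speaker" behavior, assuming voices are spatialized with a Web Audio `PannerNode`, is to set the speaker's `rolloffFactor` to 0: the distance gain then stays at 1 while panning is unaffected. The sketch below only reproduces the Web Audio "inverse" distance formula as a plain function to show the effect; the function name is hypothetical and this is not how Hubs implements anything.

```typescript
// Gain for the Web Audio "inverse" distance model:
//   refDistance / (refDistance + rolloffFactor * (max(distance, refDistance) - refDistance))
// Hypothetical helper for illustration only.
function inverseDistanceGain(distance: number, refDistance: number, rolloffFactor: number): number {
  const d = Math.max(distance, refDistance);
  return refDistance / (refDistance + rolloffFactor * (d - refDistance));
}

// Regular attendee: volume drops with distance.
const attendeeGain = inverseDistanceGain(50, 1, 1);

// Designated speaker: rolloffFactor 0 keeps the gain at 1 at any distance.
const speakerGain = inverseDistanceGain(50, 1, 0);

console.log({ attendeeGain, speakerGain });
```

Because only the distance gain changes, listeners would still hear the speaker coming from the correct direction.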

misslivirose commented 4 years ago

From Blair in #2334, suggestions on where the room override settings should live:

Is your feature request related to a problem? Please describe.

Different audio settings are appropriate for different contexts (e.g., meetup with a speaker and shared video/slides that everyone wants to hear, social gathering where people want to split into small groups, co-watching shared video). Having to go into the scene editor to change these settings is too much friction, so people likely won't do it (or don't even know that they can).

Describe the solution you'd like

These scene level options should be settable when the scene is USED to create a room. They should be settable as options on the room that override the scene global options.

Describe alternatives you've considered

Setting in Spoke.

Additional context

Being able to take a scene, especially one that isn't remixable, and use it in different contexts is super important. And I don't want to have to create different scenes for each use; I'd like to have one scene that can be updated, but the various uses that modify the global audio settings still work.

One thing that's needed: media elements in hubs scenes do not seem to always obey the default media settings, possibly when the defaults are changed after the media object is created? There should be a flag on media objects to explicitly override audio settings or not, much like the global checkbox to set global properties.

emclaren commented 4 years ago

Just wanted to mention I've had community feedback 3 times in the last week asking for the ability to "mute all" in a Hubs room.

emclaren commented 4 years ago

Just wanted to mention I've had more community requests for the ability to "mute all". Many of these come from organizers of large-ish events that may take place in Hubs.

joshkeys19 commented 4 years ago

We are using hubs for large gatherings of students to teach coding and are having the same issues. The ability to mute all and have the moderator's voice heard above the rest would vastly improve the feasibility of using hubs as a virtual classroom replacement in these times. It is such a fantastic platform with so much potential!

emclaren commented 4 years ago

I'm receiving this feedback about "mute all" at least once every few days. I wonder if at the very minimum we could implement two features as a stop-gap measure until we improve audio behaviours more broadly:

1) a button to mute everyone non-admin in the room quickly

2) a feature in the room preferences that allows only admins to unmute themselves

emclaren commented 4 years ago

At a talk in Hubs recently, a user received a personal phone call mid-session. As a moderator, it was really hard to figure out who to mute in the room, particularly with mouth-moving avatars (#2533), and especially when someone is just broadcasting quiet ambient noise by mistake.

Since I was also the "camera person", the people in the lobby couldn't watch the presentation while I searched.

Having a "mute all" for the audience as described above during the presentation would have eliminated this problem.

emclaren commented 4 years ago

Received feedback from a community member following their first experience in Hubs (@lordbron in Discord). They provided a very thorough description of their experience and reported having trouble knowing when people were speaking. They were on Quest and found they couldn't hear whether groups of people were speaking unless they were very close to them (Quest tends to be very quiet for me as well). They shared this feedback:

" it would also be nice to have a visual representation above a group that's in proximity. This would give the indication of if there's dialogue happening at the moment. Could be a fog or a color gradient, but something that a newcomer to the platform could associate with sounds (guessing that would also be a good accessibility thing for those who cannot hear so well.) I would say you can make it a toggle users can turn on or off,"

emclaren commented 4 years ago

The user I mentioned above also mentioned a loudspeaker option:

"Room owners should be able to give speakers louder volumes than attendees. During Q&A, the room admin should be able to toggle something that would allow guests asking questions to have their volume auto turned up as well so they could be heard. I really do think that this is Quest thing. I noticed on my desktop that I was hearing like background music loud and clear, but I didn't get a chance to hear people speak during that brief session before I hopped onto my quest device."

This is a feature we get asked for frequently when meeting with community members.

emclaren commented 4 years ago

Received a lot of feedback from the XR Access symposium about audio. Some comments about how it was hard to differentiate speakers in busy spaces.

Others suggested it was unclear who could hear them: "Make it clear how information is shared. For instance, at what distance can people no longer hear my voice?"

emclaren commented 4 years ago

We received another request about a loudspeaker option for room moderators.

emclaren commented 3 years ago

Just noting that we continue to receive more requests for a loudspeaker option.

Recently the team from rem5forgood shared this request: A megaphone option where presenters can reach the entire audience, and two-fold on that, being able to override audio settings for the group if you move from a social hour to a presentation and want to be able to adjust between two settings.

emclaren commented 3 years ago

In a meeting with an educator, they asked about loudspeaker options and about breakout rooms that can be very quiet for users.

JacobErvin commented 3 years ago

> Just noting that we continue to receive more requests for a loudspeaker option.
>
> Recently the team from rem5forgood shared this request: A megaphone option where presenters can reach the entire audience, and two-fold on that, being able to override audio settings for the group if you move from a social hour to a presentation and want to be able to adjust between two settings.

Seconding this. We're running multiple events per month, often featuring speakers / hosts and this feature is requested at every event, by multiple attendees.

emclaren commented 3 years ago

Just popping in to mention that loudspeaker mode came up at the A11yVR meetup this morning

vhelin commented 3 years ago

Any news about this? I'd need a way to make the user's voice propagate no further than ~10 virtual meters away from the user...

If no-one (currently assigned to no-one) is working on this, any hints on how I could do this myself? Modding Hubs is something that I can perhaps, hopefully, be able to do (working on a customer's case), but if I can just add more JavaScript to the rooms to do it, perhaps manipulate existing objects on the fly, that'd be the best option...

rawnsley commented 3 years ago

> Any news about this? I'd need a way to make the user's voice propagate no further than ~10 virtual meters away from the user...
>
> If no-one (currently assigned to no-one) is working on this, any hints on how I could do this myself? Modding Hubs is something that I can perhaps, hopefully, be able to do (working on a customer's case), but if I can just add more JavaScript to the rooms to do it, perhaps manipulate existing objects on the fly, that'd be the best option...

@vhelin if that's literally all you need it could be achieved by overriding the audio properties of the scene in Spoke. If you set the model to "exponential", the ref distance to 10m and the roll-off factor to something very large like 1000 it should work:

[Screenshot: 2021-03-25 at 15 28 21]

vhelin commented 3 years ago

> @vhelin if that's literally all you need it could be achieved by overriding the audio properties of the scene in Spoke. If you set the model to "exponential", the ref distance to 10m and the roll-off factor to something very large like 1000 it should work:
>
> [Screenshot: 2021-03-25 at 15 28 21]

Awesome, it works! Thank you a lot, you made my day. :)
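To see why the suggested Spoke settings behave like a hard ~10 m cutoff, the Web Audio "exponential" distance formula can be evaluated directly. This is a sketch under the assumption that Spoke's "exponential" model, ref distance and roll-off factor map onto the Web Audio `PannerNode` parameters of the same names; the helper function is hypothetical.

```typescript
// Gain for the Web Audio "exponential" distance model:
//   pow(max(distance, refDistance) / refDistance, -rolloffFactor)
// Hypothetical helper for illustration only.
function exponentialDistanceGain(distance: number, refDistance: number, rolloffFactor: number): number {
  return Math.pow(Math.max(distance, refDistance) / refDistance, -rolloffFactor);
}

const refDistance = 10;     // "ref distance" from the suggestion above
const rolloffFactor = 1000; // "something very large"

// Inside the reference distance the gain is clamped to full volume;
// just beyond it the huge exponent drives the gain toward zero.
for (const distance of [1, 5, 10, 10.1, 11, 20]) {
  console.log(distance, exponentialDistanceGain(distance, refDistance, rolloffFactor));
}
```

Within 10 m every voice plays at full volume with no gradual fade, so this setup trades natural falloff for a sharp boundary.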

keianhzo commented 3 years ago

@emclaren I'd like to start working on this, but the scope is quite broad. I'd like to start by narrowing it down to something more concise.

blairmacintyre commented 3 years ago

> @emclaren I'd like to start working on this, but the scope is quite broad. I'd like to start by narrowing it down to something more concise.

Audio fixes! Yay! 👍