goldfire / howler.js

Javascript audio library for the modern web.
https://howlerjs.com
MIT License

Support additional positional methods for Web Audio #41

Closed ndarilek closed 10 years ago

ndarilek commented 11 years ago

I'd like to see methods for setting velocity and orientation on howls. I'd also like to see some means of setting a listener position, orientation and velocity.

If there are no plans to do this within the week, I may have time to hammer out a pull request, but in order to reduce friction in having it accepted, it might be nice to discuss implementation here. Specifically:

As an aside, it might be a good idea to create a new method for setting position and deprecating pos3d, since this feature would add many additional 3-D attributes to sound. Maybe reserve position() for 3-D sound, add offset() as a synonym for pos, then deprecate pos/pos3d for a future 2.0 release? I won't implement that now (except for maybe the position() synonym if asked) but I'm tossing it out there as a thought.

erichlof commented 11 years ago

Hey Nolan, Turns out we are both in Houston, TX! Howdy partner :-)

I would love to see your ideas about the additional positional methods come to life in a future Howler.js release. I own Boris Smus' Web Audio API book ( demo page here: http://webaudioapi.com/samples/ ). He also has a github repo here: https://github.com/borismus/webaudioapi.com

I've been fooling around with his examples and it looks pretty trivial to set Position, Orientation, and Velocity of both listener AND sound source inside the Web Audio API. Literally, it's only like a couple of lines of code (props to W3C for making such excellent design choices for their API ! ). However, and James could clarify, this is not so trivial once you have to work this into an existing sound library such as Howler.

I was the one who requested the pos3d method, but I wouldn't mind it being deprecated in favor of a more robust method like the one you are proposing. I think your method naming is right on track and easy to understand for sound library noobs like me. I am also in favor of renaming 'position' (meaning tape or track position) to something else less confusing with 3d position (such as 'Offset' like you suggested). Or if renaming position would be too troublesome, since we're starting from scratch basically with the 3D stuff, how about a member name of 'Location' for our 3D positional value - ( listenerLocation(x,y,z) and howlerLocation(x,y,z) ). That most likely will not be confused with track/tape position used elsewhere in the Howler library, and would not require a re-write.

What are your thoughts James? How difficult is adding new methods and members to your library? As I mentioned elsewhere, I have experience programming small 3D games in C from scratch, but I have never written a library for public use, so I really don't know how much effort and re-working is involved with Howler.

goldfire commented 11 years ago

This would obviously involve some significant changes and a new major release. I'm certainly in favor of adding more of the advanced Web Audio API features into the library since when Firefox gets support this summer we'll have a majority of the population covered. However, I'm swamped at the moment (we just launched the beta of our new game a week ago), and I want to think through all of this in some more detail before making any major changes. I hope to have some free time over the weekend or sometime next week.

erichlof commented 11 years ago

Hey James, Good to hear from you - Thanks for checking in. I totally understand that you are swamped at the moment with the release of your game. Congrats on that by the way! It looked cool on your promo YouTube video. :)

About the changes, yes we should flesh out all the details such as naming conventions and basic functionality (what we would want out of the final positional library) before you have to change things around and make a new release.

I will keep this thread/issue alive by posting some naming conventions that I think would be non-confusing and helpful to the users. Maybe Nolan can chime in too as this will be for everybody to eventually use.
Also, basic functionality as was stated a couple of posts above by Nolan (ndarilek) would be great. I will restate everything concisely and clearly so you and the users of Howler know what to expect from the new methods.

Good luck with the Beta! I will be checking back soon with more details.

erichlof commented 11 years ago

I'm back with some suggestions to get this whole process started. First I guess we need a basic overview of functionality that the library users might want. Here's a brief list of capabilities - we should be able to:

Howler sound

  1. Set the Location (x,y and z coordinates) of a Howler sound in the 3D world.
  2. Set the Orientation (unit-length vector x,y,z) of the Howler sound. I'm not sure if Web Audio API requires a normalized unit-length vector or not - will have to research that. Anyway, this is useful if you have a horn or siren in the game and that horn/siren is spinning or is in a certain fixed orientation relative to the listener. When it's pointed away from you, it will sound muffled and fainter. When it turns toward you it will be brighter and louder. Web Audio calculates all this for us - would be a really cool effect to have!
  3. Set the Velocity (vector x,y,z) of the Howler sound. Same principle here, except we don't want a unit-length vector. The longer it is in a certain direction, the faster it is moving every frame. So x = 0, y = 0, and z = 1 is different from x = 0, y = 0, and z = 100. In the latter case, the sound source is moving much faster in the positive z direction. Also this method would be cool because Web Audio API automatically calculates Doppler pitch shift (e.g. an approaching ambulance that rushes by).

Player/Listener

  1. Set the Location of the player/listener (x, y, z coordinates).
  2. Set the Orientation (unit-length? vector x, y, z) of player/listener. This might require additional arguments such as an Up vector (which is usually 0,1,0 - pointing straight up out of player's head), I'll have to check the API. Anyway, this is similar to orientation of Howler sound source above. It would be nice for a FPS game with mouse-look. As the player moves the mouse around and tilts their head, the sound source will pan and dull or brighten accordingly just as in real life.
  3. Set the Velocity (vector x,y,z) of the player/listener. If you're traveling fast in a car or bike, sounds (even if they are stationary) should pan, dull/brighten, and doppler shift accordingly. Again, Web Audio API takes care of all the math for us.

Please chime in if I've forgotten anything. That should cover most 3D scenarios though. I will post back later with some suggestions for naming the methods and what arguments should be accepted by those methods.

Thanks James and all! -Erich

erichlof commented 11 years ago

I just reviewed the Web Audio API specification for spatial 3D audio, which can be found here: https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#PannerNode (Look under PannerNode section).

Upon studying this spec, it looks like we would have to add a couple more members to the Howler source sound. So, in addition to 1, 2, and 3 in my list above, here are some additional items that I think are necessary:

  4. Set coneInnerAngle (0 to 360 degrees, default is 360) of the Howler sound. Inside this angle or window, there will be no volume reduction.
  5. Set coneOuterAngle (0 to 360 degrees, default is 360) of the Howler sound. Outside this angle or window, there will be a volume reduction specified by the following 'coneOuterGain' value.
  6. Set coneOuterGain (0.0 to 1.0, default is 0) of the Howler sound. Once you step outside the 'coneOuterAngle' above, the gain or volume of the sound will be set to this new value. 1.0 would give the same value that the sound already has (and would be kind of pointless), 0.0 would cut off the sound completely. So maybe 0.5 would be a good place to start, giving a half volume when you are outside the directional sound cone.

I'm not 100% sure how the Inner vs. Outer angles work. I will read up on it and try to put it into context with a visualization.

By the way, the Orientation vectors for sound source as well as Listener do indeed need to be normalized to unit length. The Web Audio API book that I referred to earlier (by Boris Smus) confirmed this.

Back later with more. :-)

erichlof commented 11 years ago

I think I understand how the cones work now. Let's say in your game you have an outdoor air-raid Siren that is shaped like a horn or cone. It sits on a pole and is pointed in a certain direction.

The coneInnerAngle defines the smaller area (a cone) of sound that plays at full volume, because it is centered on the direction in which the Siren is pointed. If you were able to climb the tower and put your face directly in the way of the bell/cone of the Siren (where you could peer directly into its innards), this would be full intensity.
Now let's say you slowly start to walk to the right of the alarm siren.

As soon as you leave the innerCone, the sound begins to drop in volume as you approach the outerCone. The outerCone defines the larger area (also a cone) of volume Transition. Unlike the innerCone, which is always full volume, the outerCone transitions from full volume (nearer the innerCone) smoothly to a user-defined volume of coneOuterGain, nearer the limits of the outerCone. For example, if you have the coneOuterGain set to 0.5, as you walk to the right of the siren, it will transition the volume from 1.0 (full volume) through 0.9, 0.8, 0.7, 0.6, until you reach the limit of the outerCone and it settles on 0.5. Obviously the API gives a smoother transition than what I wrote, but you get my point.

Now if you keep walking to the right, you will leave the outerCone and be in the general outer area. At this point you are not in direct line of sight/hearing of the siren, and the volume stays at 0.5, no matter where you walk. As long as you don't walk back into the siren's outerCone, it will remain at 0.5.
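The siren walk-through above can be sketched as a small function. This is only an illustration of the cone behaviour as described in the Web Audio spec, not Howler code: the name `coneGain` and its parameters are made up here, and it assumes the off-axis angle between the source's orientation and the listener has already been computed in degrees.

```javascript
// Hypothetical helper, not part of Howler or the Web Audio API.
// offAxisDegrees: angle between the source's orientation and the direction to the listener.
// innerAngle / outerAngle: full cone widths in degrees (coneInnerAngle / coneOuterAngle).
// outerGain: gain applied outside the outer cone (coneOuterGain).
function coneGain(offAxisDegrees, innerAngle, outerAngle, outerGain) {
  const angle = Math.abs(offAxisDegrees);
  const halfInner = Math.abs(innerAngle) / 2; // the cone angles span both sides of the axis
  const halfOuter = Math.abs(outerAngle) / 2;

  if (angle <= halfInner) {
    return 1; // inside the inner cone: full volume
  }
  if (angle >= halfOuter) {
    return outerGain; // outside the outer cone: fixed reduced volume
  }
  // Between the two cones: blend linearly from 1 down to outerGain.
  const t = (angle - halfInner) / (halfOuter - halfInner);
  return (1 - t) + outerGain * t;
}

// Siren with a 60 degree inner cone, a 120 degree outer cone and coneOuterGain of 0.5.
const samples = [0, 45, 90].map((a) => coneGain(a, 60, 120, 0.5));
```

In a real project the browser does this calculation inside the PannerNode; the sketch is only meant to make the three regions (full volume, transition, outer gain) concrete.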

Hope this makes sense. There is a cool demo on the Html5Rocks page illustrating this using the Three.js library for rendering the 3D graphics: http://www.html5rocks.com/en/tutorials/webaudio/positional_audio/orientation.html

You can use WASD keys to move, and drag with mousebutton1 to 'free-look' FPS-style. I'm not really sure why the outerCone shown in red is facing opposite the green innerCone. Oh well, cool demo nonetheless.

Please chime in if I've missed something or misunderstood something about how the Web Audio positional API works. I will post later with some suggestions for naming conventions.

erichlof commented 11 years ago

Hi James and all, James I know you're probably pretty busy right now, but I wanted to keep this thread alive by finally posting some naming conventions for the new 3D positional functionality of Howler. First, let's think about the Howl object. Using my list above, here are some method names and their arguments that I think will be clear and pretty much self-explanatory (with the exception of the sound cone business) to the end user:

  1. Howl.setLocation(x, y, z); //Sets location of sound source with x, y, and z being coordinates in the 3D world.
  2. Howl.setOrientation(x, y, z); //Sets orientation of sound source (which way it is looking) with x, y, and z representing components of a unit length vector. This vector is pointing in the same direction as the sound source. It must be of unit length, or 1. So, (0,1,0) will work because that vector is 1 unit long, but something like (0, 230, 6.4) will not because it has not been normalized yet. There are many math libraries to do this, although we might want to do this 'in-house' by having this method normalize any incoming vectors automatically for the user. That way, the user just provides the direction vector x, y, and z, and .setOrientation() first checks to see if it is normalized and if not, then it performs the 3 lines of code to normalize it to a unit length of 1. The code to do this is in every 3D vector library - we could just borrow the code from elsewhere.
  3. Howl.setVelocity(x,y,z); //Sets the velocity of a moving sound source with x, y, z representing both a direction as well as length(speed) of the vector. This is not normalized because arguments of (0,1,0) and (0,1000,0) will result in same direction of sound source but at very different speeds. In other words, the bigger or longer the vector, the faster the sound source is moving through the 3D world.
  4. Howl.coneInnerAngle(degrees); //Sets the inner angle or window in which the sound plays at full volume. The argument 'degrees' is a number from 0 to 360, with 360 being the WebAudio default. In other words it plays equally in all directions by default and is not directional by default. As long as the listener is inside this cone or window, they will hear full intensity. A narrow setting like 10 will produce the audio equivalent of a bright directional stage spotlight.
  5. Howl.coneOuterAngle(degrees); //Sets the outer angle, beyond which the volume settles at the value specified by coneOuterGain (next). Again 'degrees' represents a number between 0 and 360, 360 being the omni-directional WebAudio default. A smaller number like 20 will make the volume drop off quicker, and a larger number like 180 will provide a bigger cone or window, with a smoother reduction of volume as the listener nears this angle or limit.
  6. Howl.coneOuterGain(volume); // Sets the volume when the listener is outside the cone or window specified by coneOuterAngle above. The argument 'volume' must be a number between 0 and 1, with 0 being totally muted, and 1 being full volume. Rarely if ever should you use 1 as this will not make a difference in your sound's directional capabilities. A number less than 1 should be used. Again, with the spotlight analogy, if you step outside the bright light on stage, how dark (or muted) should it get? A setting of 0.1 will leave the sound almost inaudible (good for a siren or bullhorn). A setting of 0.7 will be a little muted but not completely (good for walking around back behind a person while they are talking).
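The "3 lines of code" for normalizing an orientation vector, mentioned in item 2, might look roughly like this. It is a minimal sketch: `normalizeVector` is a hypothetical helper name taken from the wording in this thread, not an existing Howler function, and returning a zero vector unchanged is an assumption about how to handle that edge case.

```javascript
// Hypothetical helper; not part of Howler.
// Scales (x, y, z) so that its length becomes 1 while keeping its direction.
function normalizeVector(x, y, z) {
  const length = Math.sqrt(x * x + y * y + z * z);
  if (length === 0) {
    // A zero vector has no direction; return it as-is instead of dividing by zero.
    return [0, 0, 0];
  }
  return [x / length, y / length, z / length];
}

// The un-normalized example from the list above, (0, 230, 6.4), scaled to unit length.
const unit = normalizeVector(0, 230, 6.4);
```

A setOrientation() method could call a helper like this before handing the values to the panner, so callers never need to normalize by hand.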

Next post will be Listener capabilities:

erichlof commented 11 years ago

Ok last post before I let others especially James chime in. I don't want to hog this discussion :) Listener object methods:

  1. Listener.setLocation(x,y,z); // Just like number 1 above for the Howl sound source, except now we are talking about the listener's or player's settings. Set the listener's position with default being (0,0,0).
  2. Listener.setOrientation(lookX, lookY, lookZ, upX, upY, upZ); // Sets the direction in which the listener is facing. lookX, lookY, and lookZ represent components of a vector pointing in the direction of the listener's nose, or in other words, where their head and gaze are looking. WebAudio requires a unit length vector, so we just copy and paste the normalize code into this function as well. upX, upY, upZ represent which way is up, or the amount of tilt of the listener's head. By WebAudio default, it is (0,1,0) where you can imagine a stick emanating from the top of your scalp and pointing straight up towards the sky. I think having the listener look straight down at his/her shoes would mean a look vector of (0, -1, 0) and an up vector of (0, 0, -1). The stick would now be pointing straight into the computer screen. Again, we need the normalizeVector function for this 'up' vector as well, because WebAudio demands it be of unit length.
  3. Listener.setVelocity(x,y,z); // Like number 3 above, sets the velocity of the Listener now. Is NOT normalized, as a longer vector will mean faster speed/movement each frame. WebAudio will take this as well as the earlier Howl.setVelocity() and magically provide Doppler pitch/intensity shift effects to the end result.
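To make the look/up pair in item 2 more concrete, here is a small sketch that derives both vectors from a head pitch angle. It assumes the Web Audio default of a listener facing down the negative z axis (into the screen) with y pointing up; `lookAndUpForPitch` is a hypothetical name used only for this illustration.

```javascript
// Hypothetical helper; not part of Howler or the Web Audio API.
// pitchDegrees > 0 tilts the head back (looking up), < 0 tilts it forward (looking down).
// Assumes the default listener faces (0, 0, -1) with up = (0, 1, 0).
function lookAndUpForPitch(pitchDegrees) {
  const t = (pitchDegrees * Math.PI) / 180;
  return {
    // The nose direction rotates in the y/z plane.
    look: [0, Math.sin(t), -Math.cos(t)],
    // The "stick out of the scalp" rotates by the same angle and stays perpendicular to look.
    up: [0, Math.cos(t), Math.sin(t)],
  };
}

// Level head: look is (0, 0, -1) and up is (0, 1, 0).
const level = lookAndUpForPitch(0);
// Looking straight down: look is approximately (0, -1, 0), up approximately (0, 0, -1).
const down = lookAndUpForPitch(-90);
```

Both results are already unit length, so they could be passed to a setOrientation() method without further normalization; yaw and roll are left out to keep the sketch short.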

At this point I'll let others have their say. As always comments and suggestions are welcome. Good luck James with your game! Hope to hear from you guys soon. :-)

ndarilek commented 11 years ago

Finally have a bit of time to sit down and work on this. Two things:

  1. Sorry if I missed this, but what should be done about unsupported methods--position methods when using the audio tag, for instance? Thinking they should just be no-op, and it's up to the game developer to a) not play sound if Web Audio is required or b) produce an alert about degraded support.
  2. One minor nit, and I won't bikeshed it, is that I still prefer "position" instead of "location" if only because it aligns the names with the Web Audio spec. If folks prefer "location" then that's fine too, I just like being in sync with the spec where possible.
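The no-op idea in point 1 could be sketched as follows. This is not Howler's implementation: `SketchSound`, its `usingWebAudio` flag, and `supportsPositional()` are hypothetical names chosen for the example, and the real library may track its playback backend differently.

```javascript
// Hypothetical stand-in for a sound object; not Howler's actual class.
class SketchSound {
  constructor(usingWebAudio) {
    this.usingWebAudio = usingWebAudio; // false when falling back to the audio tag
    this.currentPosition = [0, 0, 0];
  }

  // Lets the game developer decide whether to skip the sound or warn about degraded support.
  supportsPositional() {
    return this.usingWebAudio;
  }

  // Positional setter that silently does nothing without Web Audio.
  position(x, y, z) {
    if (!this.usingWebAudio) {
      return this; // no-op, but still chainable
    }
    this.currentPosition = [x, y, z];
    return this;
  }
}

const fallbackSound = new SketchSound(false).position(1, 2, 3);
const webAudioSound = new SketchSound(true).position(1, 2, 3);
```

Returning `this` in both branches keeps call chains working, so game code does not need a separate path for the audio tag fallback unless it wants one.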

Anyhow, I'll try to get a pull request together in the next few days. Thanks for doing the legwork on hashing out solid names/requirements. My initial pull may not support everything since I don't need it all, but I'll take a crack at implementing position/velocity/Doppler settings.

P.S. I'm actually in Austin, have been for a few years. Shows how rarely I update my profile. :)

erichlof commented 11 years ago

Hi Nolan, Thanks for taking a crack at a pull request. Also, I agree with keeping naming conventions close to the API, such as 'position'. The only reason I came up with a synonym 'location' is that 'position' is used elsewhere in Howler to mean 'tape position' (that dates me :-) ) or 'playback position' of a sound. I didn't want possible confusion to ensue or James to have to change all the old code instances of 'position' to 'playbackPosition' or something. As long as neither of these is a problem for him and most others, I'm totally fine with your recommendation of 'position'.

So far as number of methods go, for most use cases, position and velocity of both sound and listener will be used, and in the case of a first person game, the orientation of the player/listener would be important also. But if you can't get around to sound source orientation and its sound cone (the two go hand-in-hand), don't worry about it. That was just for full implementation. I must admit, I can't think of when I would need a spinning alarm siren with a sound cone in a future game, but I guess someone else might. :)

p.s. Austin is a great city. Some of my extended family is there and they love it. Good to know that you're still in Texas - yeeHaw!

erichlof commented 11 years ago

Hi James, While this 3d panner stuff is in revision, I want to propose that we change the underlying equation that the Web Audio API uses to determine the final output. This is easily done with one line of code, but may make a big performance difference for older computers and cell phones (once Web Audio is implemented on all mobile platforms). The reason I bring this up is that I read the following on the W3C Web Audio Spec page:

"In addition to the convolution effect, the PannerNode may also be expensive if using the HRTF panning model. For slower devices, a cheaper algorithm such as EQUALPOWER can be used to conserve compute resources."

Here's a link: http://www.w3.org/TR/webaudio/#Simplification-of-Effects-Processing

By default, when Howler utilizes the 3D panner, it is set to the most processor-intensive HRTF model. I don't think casual gamers (and even the programmers) will notice if we just use the EQUALPOWER model instead. As long as the sound pans from left to right and front to back convincingly, the speed-up on slower platforms outweighs accuracy IMO.

Do you all agree? This would be easy to implement as it just requires setting a flag when the panner node is created in Howler.

ndarilek commented 11 years ago

I would like the ability to either a) explicitly set the panner mode or b) set a simple highQuality property which toggles between the two settings. My games are sound-focused, so using high-quality audio where available is key.

I say we go with the defaults, then add a property should it prove to be a problem.

For the record, I've played sound-based games that use HRTF processing on my 4G iPod, pretty slow by current device standards, and haven't encountered issues. We should look to that if folks report performance problems, but perhaps not before then.
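A highQuality toggle along the lines of option b) might be sketched like this. The function name `applyPanningQuality` is hypothetical, and the sketch assumes the string values 'HRTF' and 'equalpower' that current browsers accept for PannerNode.panningModel; the spec draft quoted earlier in this thread refers to them as HRTF and EQUALPOWER.

```javascript
// Hypothetical helper; not part of Howler.
// panner: a PannerNode, or any object with a panningModel property.
// highQuality: true for HRTF, false for the cheaper equal-power model.
function applyPanningQuality(panner, highQuality) {
  panner.panningModel = highQuality ? 'HRTF' : 'equalpower';
  return panner;
}

// In a browser this would be something like ctx.createPanner(); a plain object
// stands in here so the sketch also runs outside a browser.
const fakePanner = { panningModel: 'HRTF' };
applyPanningQuality(fakePanner, false);
```

Keeping the choice behind a single boolean leaves the browser default untouched unless a developer opts in, which matches the "go with the defaults, add a property if it proves to be a problem" suggestion.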

erichlof commented 11 years ago

I like the idea of the highQuality = true/false property. That way, game developers could target the appropriate tech for the particular project they are working on. You're right though, maybe we won't need to do anything about this, and even old devices and cell phones (once Web Audio is universally accepted) will have no problem with the default HRTF model. The W3C audio group was just covering all possible bases with their suggestion I guess.

Thanks!

goldfire commented 11 years ago

Just wanted to post an update here to let you guys know that this hasn't been ignored. All of the feedback in this thread has been really great, and this will certainly be the focus for the next major version. Unfortunately at the moment I'm incredibly swamped as we are a few weeks from the public launch of the game we've been working on for the last year. Once we are past that I'll be able to get going on these ideas more seriously.

erichlof commented 11 years ago

Hey James, no problem. Thanks for checking in. I knew that was the case with your development team getting ready to launch your game. Good luck with that btw!

Please also consider my older feature request of a simple compressor (or automatic volume limiter) and maybe a simple high-pass filter for a muffled effect on certain in-game sounds.

I'm off to check out v1.1.10 ! -Erich

ryanbarringtoncox commented 10 years ago

Thanks for howler, James!

I second the request for a compressor or a suggested hack to add one. I'm getting some nasty distortion when lots of sounds are happening at once.
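As a rough idea of what such a hack could look like with plain Web Audio, the sketch below routes a node through a DynamicsCompressorNode before the output. It does not touch Howler's internals: `routeThroughCompressor` is a hypothetical name, the threshold and ratio values are arbitrary starting points, and it assumes you already have an AudioContext and the node you want to compress.

```javascript
// Hypothetical helper; not part of Howler.
// ctx: an AudioContext. sourceNode: any AudioNode that currently feeds the output.
function routeThroughCompressor(ctx, sourceNode) {
  const compressor = ctx.createDynamicsCompressor();
  compressor.threshold.value = -18; // dB level above which compression starts (assumed value)
  compressor.ratio.value = 4;       // 4:1 reduction above the threshold (assumed value)

  // source -> compressor -> speakers
  sourceNode.connect(compressor);
  compressor.connect(ctx.destination);
  return compressor;
}
```

If the source was already connected straight to ctx.destination, it would need to be disconnected first so that only the compressed signal reaches the speakers.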

ndarilek commented 10 years ago

BTW, I've (finally!) started work on this. You can see my efforts here. At the moment it supports setting source position/velocity, refDistance/maxDistance/rolloffFactor, as well as listener position/velocity, Doppler factor and speed of sound. Direction is not implemented since it only seems relevant to sound cones, which aren't yet implemented. I renamed 'pos' to 'offset' since 'position' is a specific concept in Web Audio and I wanted to keep as close to the spec as possible so people could use it and related materials as their documentation without having to context switch terminology in their heads ("Everything matches up but 'position' is 'location?' Blargh!") Accordingly, 'pos3d' is now 'position'.

Note that, as of now, there has been zero testing of this. I just plugged away at refactoring/adding/reading the spec for the past hour. I hope to test it in my own game shortly and make additional tweaks in preparation for a pull request. It may not work at all, and I'd be surprised if I didn't make a huge syntax error in the cutting/pasting. Just wanted to get what was done out there in case others wanted to look it over and offer feedback.

I'm also fairly new to JS, preferring static languages or JS dialects that make it more concise/clean. Please excuse any errors or cut-and-paste sillinesses.

As an aside, there appears to be a min.js file in the repository, but the repository lacks the mechanism to build this. How about a Gruntfile or something? I'm sure it isn't hard to minify but it'd be nice if there was a standard tooling/set of options so we all got the same output.

erichlof commented 10 years ago

Hi Nolan, Thanks so much for getting this started! Just glancing at it, everything (with naming conventions and all) looks great! There are a few issues though that will pop up if you try and run it I believe, because of the straight cut-and-paste nature of the additions. However, I will comment on them on your forked repository rather than here, so once you get all the wrinkles out, you can hopefully make a PR to James.

But so far it looks great - just what Howler.js needs IMO for 3D game/apps compatibility.

Thanks again Nolan for your work - see you soon on your repository! -Erich

edit: Nolan, could you turn on the Issues feature for your forked Howler repository? I wanted to start a new issue but there is not an option to do that on the webpage. Either that, or we could just create and work out issues here on the original Howler issue tracker page. Just let me know. :)

ndarilek commented 10 years ago

The branch has gone through multiple rebases and edits, and yes, I did catch a few issues. The code now runs, I can say that much for it. :)

At this point I'll likely just add directions and sound cones, since it'd make sense just to push through and finish the implementation. I'll also be integrating it into my game project, and if something doesn't work then I'll commit the change and squash it into the commit history where it'd logically fit.

I'm also going to return pos() and pos3d() as calls to offset() and position() and with logs that they're deprecated. I'll leave it to others to decide when to pull the stubs.
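The deprecated stubs described above could be sketched along these lines. This is an illustration rather than the code in the branch: `addDeprecatedAlias` and the `sketchHowl` object are hypothetical, and the real methods take whatever arguments offset() and position() accept.

```javascript
// Hypothetical helper; not taken from the actual branch.
// Adds target[oldName] as a stub that logs a deprecation notice and forwards to target[newName].
function addDeprecatedAlias(target, oldName, newName, log = console.warn) {
  target[oldName] = function (...args) {
    log(`${oldName}() is deprecated; use ${newName}() instead.`);
    return this[newName](...args);
  };
}

// Minimal stand-in object with the new method names from this thread.
const sketchHowl = {
  offset(seconds) { return `offset:${seconds}`; },
  position(x, y, z) { return `position:${x},${y},${z}`; },
};

addDeprecatedAlias(sketchHowl, 'pos', 'offset');
addDeprecatedAlias(sketchHowl, 'pos3d', 'position');
```

Because the stubs only forward, removing them later is a matter of deleting the two alias calls once people have migrated.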

goldfire commented 10 years ago

Thanks all of you for working on this! As soon as I get a chance I'll look through this branch more closely than I have time for at the moment.

goldfire commented 10 years ago

Well this certainly took a lot longer than I ever expected, but the 2.0 branch is now public and ready for testing. There is now an effects plugin that contains these and more features. Here are the details: https://github.com/goldfire/howler.js/issues/193