WebAudio / web-audio-api

The Web Audio API v1.0, developed by the W3C Audio WG
https://webaudio.github.io/web-audio-api/

Exposing a playbackPosition property on AudioBufferSourceNode. #2397

Open notthetup opened 10 years ago

notthetup commented 10 years ago

AudioBufferSourceNode lacks a playbackPosition property which exposes the sample index which is currently being played, or the last sample index played if the AudioBufferSourceNode is not playing. The property can be read-only, and to keep in line with the rest of the API it can be a time value rather than an index.

If the playbackRate doesn't change, the current playback position can be easily calculated using AudioContext's currentTime property. But with parameter automation on playbackRate, the calculations can get pretty gnarly and inaccurate.
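A minimal sketch of that constant-rate calculation, for reference. The function and parameter names are illustrative (not part of the Web Audio API), and it assumes a positive, unchanging playbackRate and the default loop points:

```javascript
// Estimate the playback position (seconds into the buffer) of a source
// whose playbackRate never changes. All names here are hypothetical.
function estimatePosition({ now, startedAt, offset = 0, rate = 1, duration, loop = false }) {
  // Nothing has played before the scheduled start time.
  const elapsed = Math.max(0, now - startedAt);
  const raw = offset + elapsed * rate;
  if (loop) {
    // Wrap around the whole buffer (assumes default loopStart/loopEnd).
    return raw % duration;
  }
  // A non-looping source stops at the end of the buffer.
  return Math.min(raw, duration);
}
```

In a page this would be called with `now: context.currentTime`, `startedAt` set to the time passed to `start()`, and `duration: source.buffer.duration`. With looping, `Math.floor(raw / duration)` would give the number of completed loops under the same assumptions.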

This property can come in handy when trying to implement a few types of scenarios:

  1. A pause/resume() functionality to hold/unhold the playback of a specific source, while other sources keep playing.
  2. Synchronizing multiple audio buffers based on user interaction. For example, starting a buffer at the exact sample position where another one stopped (based on user interaction).

Currently, it seems, based on a conversation in the public-audio mailing list, the only way to generate this property is to capture all changes to the playbackRate parameter and calculate the playbackPosition based on that. This is complicated and redundant since the playbackPosition is already being stored internally by the Web Audio API, and recalculating it in JavaScript is a waste of effort.

colinsullivan commented 10 years ago

Also, the onended callback of the AudioBufferSourceNode could be helpful in the second use case you mentioned.

It seems to me that for the web audio platform it is unreasonable to expect sample-accurate events based on user interaction.

cwilso commented 10 years ago

Confused: you mention a suggestion and the very reason it's not tenable. :) the 'onended' callback of AudioBufferSourceNode can't be used to synchronize, because it's NOT sample-accurate; no JS event can be, because the window for a single sample is on the order of hundredths of a millisecond; JS event delivery is on the order of a few milliseconds, IF garbage collection or other main thread stuff (layout, JS execution, etc) don't get in the way. You need to schedule ahead.

playbackPosition will have to expose the sample position of the buffer for the next scheduling - that is, the next bit that WILL be scheduled. That will let you do the appropriate math.

karlt commented 10 years ago

If we need a playbackPosition, then the onended event would be a good place to put it. That is sufficient for starting playback of a buffer from a point where it was previously stopped, if you are happy to have a gap between stop and restart.

I'm not so keen to keep playbackPosition always up to date on the AudioBufferSourceNode. Could scenario 2 in comment 28485063 be addressed by playing all the buffers in sync and using GainNodes to switch in response to user events?

notthetup commented 10 years ago

Thanks everyone for your feedback.

@karlt Yup. That would work as long as all the AudioBufferSourceNodes are kept running and a GainNode is used to switch between the individual sources. But if looping is enabled, this only works if all source files are exactly the same length, which might not always be the case.
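A rough sketch of that GainNode switching idea, assuming every source is already playing in sync through its own GainNode. The helper name and signature are hypothetical; only `AudioParam.setValueAtTime` is an actual API call:

```javascript
// Make exactly one of several synchronized sources audible.
// `gainNodes` is an array of GainNodes (one per source); `when` is a time
// on the AudioContext clock. The helper itself is illustrative.
function switchTo(gainNodes, activeIndex, when) {
  gainNodes.forEach((node, i) => {
    // setValueAtTime schedules the change on the audio clock,
    // so all gains flip at the same moment.
    node.gain.setValueAtTime(i === activeIndex ? 1 : 0, when);
  });
}
```

A call such as `switchTo(gains, 1, context.currentTime + 0.05)` would switch shortly in the future; a hard 0/1 step can click, so a short ramp may be preferable in practice.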

The pause (scenario 1), or some variant of it where one AudioBufferSourceNode is stopped and another is started, is definitely where knowing the last playbackPosition becomes critical.

As for the sample accuracy argument, sometimes it's enough to get a "sample inaccurate" playbackPosition value: a rough estimate to figure out what's being played. Currently, if parameter automation is being used, the only way to guess even an inaccurate position is basically to capture and process all the automation events in JS.

Lastly, exposing the playbackPosition in onended may work. But for one of the use cases I'm looking at, I need to count the number of loops an AudioBufferSourceNode has completed. Sampling playbackPosition would allow me to check how many loops the AudioBufferSourceNode had completed, even with a changing playbackRate.

cwilso commented 10 years ago

I see a (relatively narrow) use case for the onended return of playbackPosition - but for the scenario I badly needed it, that wouldn't work at all. I have to track the current playback time in wubwubwub (http://webaudiodemos.appspot.com/wubwubwub/ - press the power button, wait a few seconds, press it again, lather, rinse and repeat); I can't wait until it's ended, because it has to smoothly modify, but the playback position controls the deck visuals. (Easier to see if you have a DJ controller attached and you scrub.)

karlt commented 10 years ago

I see, thanks Chris. Your demos/examples are very helpful.

Implementations could actually do AudioBufferSourceNode.playbackPosition reasonably efficiently in the common cases, if necessary, by sending updates from the processing thread to the AudioContext thread only when the playbackRate computedValue has changed. The AudioContext thread can then use the last sync time and rate computedValue to calculate from currentTime until the next update.
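The main-thread side of that scheme could look roughly like the sketch below. The `sync` record and its field names are hypothetical; it stands for whatever the processing thread would send each time the computed playbackRate changes:

```javascript
// Extrapolate the playback position between updates from the audio thread.
// sync.time     - context time of the last update (seconds)
// sync.position - buffer position at that time (seconds)
// sync.rate     - computed playbackRate at that time
function extrapolatePosition(sync, currentTime) {
  return sync.position + (currentTime - sync.time) * sync.rate;
}
```

This is only exact while the rate stays constant between updates, which is the common case the comment describes.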

cwilso commented 10 years ago

Yes. The more difficult part is when the playbackRate AudioParam has complex (e.g. setTargetAtTime/exponentialRampToValueAtTime) scheduling going on. My wubwubwub demo does the calculations between updates, basically the way you suggest, and the math is only moderately complex for the linear ramps I use on power up/down; the exponential ones get harder. (Area under a curve, rather than area under a line, basically.)
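To illustrate the "area under a line" versus "area under a curve" point, here is a hedged sketch of how far the playhead advances during a single ramp on playbackRate. The function names are hypothetical; the exponential case assumes the ramp shape v0 * (v1 / v0)^((t - t0) / (t1 - t0)), with v0 and v1 non-zero and of the same sign:

```javascript
// Seconds of buffer consumed between t0 and t during a linear ramp
// of playbackRate from v0 (at t0) to v1 (at t1): area of a trapezoid.
function linearRampArea(v0, v1, t0, t1, t) {
  const dt = Math.min(Math.max(t, t0), t1) - t0;
  const rateAtT = v0 + (v1 - v0) * (dt / (t1 - t0));
  return (dt * (v0 + rateAtT)) / 2;
}

// Same quantity for an exponential ramp: integral of v0 * e^(k * x / span).
function exponentialRampArea(v0, v1, t0, t1, t) {
  const dt = Math.min(Math.max(t, t0), t1) - t0;
  const span = t1 - t0;
  const k = Math.log(v1 / v0);
  if (k === 0) return v0 * dt; // constant rate, nothing to integrate
  return ((v0 * span) / k) * (Math.exp((k * dt) / span) - 1);
}
```

Summing these areas over every scheduled automation event is what makes a JavaScript-side position tracker tedious to maintain.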

notthetup commented 10 years ago

@cwilso Yup. I have a working version of the same in my code now. I also have to keep a list of parameter automation events (had to look up the Chromium source here) so that I can figure out things like when to stop calculating for setTargetAtTime etc. It works, but it's not very accurate. If I can convince some people, I will open source it at some point.

carlosgmartin commented 10 years ago

Is there any indication as to when this feature could be implemented in the Web Audio API?

cwilso commented 10 years ago

No, there is not.

To be clear: this feature would be: "Expose a floating-point playbackPosition on BufferSourceNode. This will represent where in the buffer the next playback block is coming from, in terms of seconds. It should be cautioned that it is dangerous to expect this will be useful for sample-accurate scheduling, as rounding errors and thread interactions may cause disruption."

sebpiq commented 9 years ago

I would really love to see this as well, but I'd love it to be an AudioParam. The use case is reading sound in a very complex way : going back, forth, jumping around and so on. In fact if such a feature existed, it would cover all the other features already existing on the node (playbackRate, loop), and provide additional functionality that is not available at the moment (jumps).

karlt commented 9 years ago

That sounds very similar to WaveShaperNode.

sebpiq commented 9 years ago

Hmmm ... There is probably a misunderstanding, because I really don't see the similarity .. Could you explain?

karlt commented 9 years ago

Hmmm ... There is probably a misunderstanding, because I really don't see the similarity .. Could you explain?

In each case, an input describes which part of a buffer is produced on the output of the node. With playbackPosition on AudioBufferSourceNode, the input would be an AudioParam. With WaveShaperNode, the input is from the output of another AudioNode. A GainNode with constant input from another source can be used to convert an AudioParam to AudioNode output.

cwilso commented 9 years ago

The point of this issue was to expose a read-only position, not to enable scrubbing through a buffer with an AudioParam.

sebpiq commented 9 years ago

Ow yeah ... apologies Karl, you're right. Looks like it could be used for this. Though it is obviously not what WaveShaper is intended for.

Chris, should I just open an issue for that? ;)

NHQ commented 9 years ago

YES PLEASE NO QUESTION

AshleyScirra commented 9 years ago

Users of the Construct 2 game engine need this. It is very difficult to implement a simple pause and resume without a Web Audio API provided playback position (bikeshedding: I'd prefer the name playbackTime). When playback is paused we need at least a reasonably accurate (not necessarily sample accurate, but close-enough) playback time to pass as the offset to the next start() when resuming. JS timers are not synchronised to the audio clock so I'd expect them to drift apart even if we tried to track this ourselves, especially with looping playback, which was actually the use case which sent me looking for this.

I think the fact there is an onended event is admission enough that it is not adequate to track the playback time with JS. We could just fire the ended event ourselves at the time (currentTime + duration), but obviously this does not work if the playbackRate changes. So I'm a little surprised this is not already in place. Pausing and resuming is a pretty basic use case, and it should be easy to implement this.

FWIW, changing the playbackRate is an interesting feature for games - it's good for accelerating engine sounds, time scaling (i.e. slo-mo) effects, varying the pitch of environmental sounds like footsteps to make them sound less repetitive, and more.

notthetup commented 9 years ago

@AshleyScirra If you need a temporary workaround for this, I use a really ugly hack which kinda works.

Basically one has to create a second BufferSource where the samples are just counts [1, 2, 3, 4], which is then connected to a ScriptProcessor which stores the last value from every input buffer. The play and pause calls and also the changes of playbackRate have to be forwarded to this counter buffer as well. (Maybe these slides explain it better: http://chinpen.net/talks/wac-paper/#/37 ).

The playPosition is only accurate to block size (128 samples), but it's better than nothing. Also it means having to use the ScriptProcessor quite a bit, which has its own downsides.

AshleyScirra commented 9 years ago

@notthetup - hopefully the ugliness of that hack is motivation for the spec to officially include this :P

cwilso commented 9 years ago

Note the milestones - Joe moved this to v.next.

dorontal commented 8 years ago

That's great news that it's in the milestones. I'd like to join the chorus and reiterate that pause/resume cannot currently be implemented accurately without position exposed. So a simple audio player (similar to, for example, the <audio> tag built-in player) cannot currently be implemented with web audio!

That said, I currently use a workaround different from the one above, but it is not extremely accurate and its accuracy gets worse the more pauses you make. The basic idea is to (a) refer to AudioContext's currentTime property for time (do not use JavaScript time for audio); (b) as soon as pause (disconnect) is called, record lastPauseTime = audioContext.currentTime; then, when you resume, accumulate the paused duration with totalPauseTime += audioContext.currentTime - lastPauseTime. Your playback time is then audioContext.currentTime - totalPauseTime. This is not 100% accurate and barely tolerable, because there's a slight delay between the disconnect() call to pause and the measurement of time via currentTime, and if the VM decides to garbage collect between those two statements there would be a huge delay...

Found a decent example (not mine) of this method at this Codepen
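For readers without access to the Codepen, a minimal sketch of this bookkeeping follows. It is not the linked example; the class name and the injected `getTime` callback are assumptions, and it additionally subtracts the time at which playback started, which the description above leaves implicit. It only holds for a playbackRate of 1:

```javascript
// Tracks playback time across pauses using only the audio clock.
// `getTime` returns that clock in seconds, e.g. () => audioContext.currentTime.
class PauseClock {
  constructor(getTime) {
    this.getTime = getTime;
    this.startedAt = getTime();
    this.totalPauseTime = 0;
    this.lastPauseTime = null; // non-null only while paused
  }

  pause() {
    if (this.lastPauseTime === null) this.lastPauseTime = this.getTime();
  }

  resume() {
    if (this.lastPauseTime !== null) {
      // Accumulate, so that several pauses add up.
      this.totalPauseTime += this.getTime() - this.lastPauseTime;
      this.lastPauseTime = null;
    }
  }

  playbackTime() {
    // While paused, the position is frozen at the moment of the pause.
    const now = this.lastPauseTime ?? this.getTime();
    return now - this.startedAt - this.totalPauseTime;
  }
}
```

The value returned by `playbackTime()` is what would be passed as the offset argument of the next `start()` call when resuming.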

notthetup commented 8 years ago

@dorontal Your technique is great, but it assumes that playbackRate on the AudioBufferSourceNode is unchanged from the default of 1. If that parameter is changed, or worse automated, then the calculations get a lot more hairy.

dorontal commented 8 years ago

I agree. Thanks for pointing that out!

cherston commented 8 years ago

Joining the chorus of people who would very much welcome this feature, both for the pause/resume scenario as well as the synchronization of multiple buffers scenario.

I think the workaround that I will use to accommodate @notthetup's response to @dorontal will be to keep track of the playbackRate changes as well.

notthetup commented 8 years ago

@cherston Yes. And that soon gets very, very tedious when you have to factor in parameter automation. Very soon I felt like I was reimplementing AudioParam in JS :(

But with the recent change in the API to support playbackRate from [-Inf, Inf], one of the most common use cases I had for this request (reverse playback) is already supported on AudioBufferSourceNode.

jakearchibald commented 8 years ago

Another use-case: Say I have a looping track, and the user loads in an additional looping track (of the same length), once that track loads I want to start playing it in addition to the current track, in time with the current track.

Is the currentTime of the context accurate for this? Eg if I do:

const loopStartTime = context.currentTime;
loop1Source.connect(context.destination);
loop1Source.loop = true;
loop1Source.start(0);

// then seconds later…
loop2Source.connect(context.destination);
loop2Source.loop = true;
loop2Source.start(0, (context.currentTime - loopStartTime) % loop1Source.buffer.duration);

…will the loops be playing exactly in time with one-another?

cwilso commented 8 years ago

"Sort of", yes. The problem is that we don't have a way to ENSURE that the audio thread has processed a block in the time between when you get the context.currentTime and the time the start() executes. This means it's always possible you could miss the next actual processing block if start(0) is called.

The solution is that you should hardly ever call start(0) for anything you want to synchronize. It's best to schedule ahead by the size of one processing "batch", in case the audio thread processes while your code is running in the main thread - you shouldn't presume that "currentTime" is precisely when you can start. That batch is at least a sample block (128 samples) - more on slow systems. The size of one processing chunk is exposed in the spec as "context.baseLatency", but not implemented yet (in Chrome at least).

This should work:

var batchTime = context.baseLatency || (128 / context.sampleRate);
const loopStartTime = context.currentTime;
loop1Source.connect(context.destination);
loop1Source.loop = true;
loop1Source.start( loopStartTime + batchTime);

// then seconds later…
loop2Source.connect(context.destination);
loop2Source.loop = true;
var now = context.currentTime;
loop2Source.start(now + batchTime, (now + batchTime - loopStartTime) % loop1Source.buffer.duration);

Note that usually, you would want to align on beats anyway, so you wouldn't immediately start - or you'd start both samples playing and just control their volumes through gain nodes.

cwilso commented 8 years ago

Forgot to say - exposing currentPlaybackTime would not change this scenario in the least - the problem is not in doing that math, it's in "when can I actually get something to start playing".

jakearchibald commented 8 years ago

@cwilso cheers! Given that synchronisation is more important than immediacy in this case, would using a second as the base latency produce a more reliable result?

cwilso commented 8 years ago

Well, yeah, but a second is a long time. :) I think you'd be fairly safe across even really bad device hosts with maybe a quarter of a second? With some quick napkin calculations, we haven't ever seen situations with worse than about 150ms, so even that's overkill.

jakearchibald commented 8 years ago

@cwilso ta!

DracotMolver commented 7 years ago

Maybe this is the place where I can get some info. I'm looking for the same thing, time elapsed. What I'm doing in my app is recreating the time that elapses while a song is playing. The way I do this is like this:

const audioContext = new window.AudioContext();
let source, duration, minute, second;

const xhtr = new XMLHttpRequest();
xhtr.open('GET', 'song_file.mp3', true); // placeholder path to the song
xhtr.responseType = 'arraybuffer';
xhtr.onload = () => {
  audioContext.decodeAudioData(xhtr.response).then(buffer => {
    source = audioContext.createBufferSource();

    // The buffer gives us the song's duration in seconds;
    // split it into minutes and seconds, e.g. 6:20.
    duration = buffer.duration;
    minute = Math.floor(duration / 60);
    second = Math.floor(duration % 60);

    source.onended = () => {

    };
    // I'm using some filters... don't pay attention to them
    // (`filter` is an array of filter nodes created elsewhere).
    source.buffer = buffer;
    source.connect(filter[0]);
    filter.reduce((p, c) => p.connect(c)).connect(audioContext.destination);

    startTimer(); // This function will loop to get the time elapsed
    source.start(0);
  }, reason => {
    // Something happened
  });
};
xhtr.send(null);

Here I calculate every second of the song while it is playing. I use requestAnimationFrame because it gives me about 60fps, so it's pretty close to a perfect elapsed time. But the bad thing about requestAnimationFrame is that when you minimize the browser or switch tabs, the page goes to the background and the callbacks stop, then start again when the tab is focused. This is because JavaScript works with a single thread in browsers (as far as I know).

function startTimer() {
  const TIME_ITER = () => {
    // do something here like calculate the time
    interval = requestAnimationFrame(TIME_ITER);
  };
  interval = requestAnimationFrame(TIME_ITER);
}
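One way to fill in the "calculate the time" step without depending on how often the callback fires is to derive the elapsed time from the audio clock on each frame. The helpers below are a sketch with hypothetical names; `startedAt` stands for the context time at which `source.start()` was called, and a playbackRate of 1 is assumed:

```javascript
// Elapsed playback time derived from the audio clock, not from frame counts.
function elapsedSince(startedAt, currentTime) {
  return Math.max(0, currentTime - startedAt);
}

// Format a number of seconds as m:ss, e.g. 380 -> "6:20".
function formatTime(totalSeconds) {
  const whole = Math.max(0, Math.floor(totalSeconds));
  const minutes = Math.floor(whole / 60);
  const seconds = whole % 60;
  return `${minutes}:${String(seconds).padStart(2, '0')}`;
}
```

Inside TIME_ITER this would be `formatTime(elapsedSince(startedAt, audioContext.currentTime))`. If the tab is in the background and callbacks are skipped, the next callback still shows the right time, because the audio clock kept running.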

I'd like to have an updateTimeElapsed() function on the AudioBufferSourceNode interface (the node returned by createBufferSource).

To jump from e.g. 5:00 to 5:10, I have to stop the AudioBufferSourceNode, which kills everything, and I have to run part of the code above again. This could be solved if the resume() function of the AudioContext accepted a double value indicating where the song should resume from, just like the start() function does on the AudioBufferSourceNode.

rtoy commented 7 years ago

Even if AudioContext.resume accepted a value for when an ABSN would resume from, this won't really work in general. If the source is followed by a long filter or convolver, resuming the source at a new time will have a funny effect because the filter or convolver memory will have inputs from the original data mixed in with the new ABSN data. It won't be a smooth transition.

In any case, this issue is marked v.next, so it's highly unlikely anything will change soon. However, I don't think it would be hard to implement a playbackPosition. It won't be sample-accurate; very likely it will be some value in the past, but close to the current time. Specifying it is another question altogether.

nektro commented 7 years ago

Adding a +1 I would love to see this property added
edit: to clarify I support playbackPosition being added to onended

nuthinking commented 7 years ago

+1

mdjp commented 7 years ago

It would be good to get a concrete use case for this issue.

AshleyScirra commented 7 years ago

There are some already in this thread. For example, pausing and resuming when varying the playback rate, requires an audio-synchronized playback time to know where to resume from.

OrbitalJeffL commented 6 years ago

I have a pretty solid use case. I'm using the Web Audio API to synchronize video and audio content. Because playback start time cannot be guaranteed, or because of video playback stuttering, I need to periodically synchronize my video with my current audio position to a high degree of accuracy. Currently, I try to access the playback position of the audio to set the video to the same state, but because of the inaccuracy involved in context.currentTime - loopStartTime I'm off by several frames. An accurate track playback position seems a very useful and reasonable feature.

haywirez commented 6 years ago

This is absolutely essential - think about a DJ-style player where you have varying playback speeds. Can't believe this wasn't thought of earlier, and it needs to be sample accurate.

rtoy commented 6 years ago

Since the audio runs on a separate thread, and you're querying the value from the main thread, how can this really work and produce sample-accurate values? At best, I think you would get the position of the last render quantum, so the returned value is always in the past.

hoch commented 6 years ago

This is where the AudioWorkletGlobalScope's currentFrame can be useful. It gives you the actual exact frame index in the same thread.

mr21 commented 6 years ago

Why, for a DJ app, would having the value in samples be better than in seconds? If Web Audio handles the audio, the UI side doesn't need to be so precise.

haywirez commented 6 years ago

@Mr21 Ideally you want totally precise cueing and slow scrubbing to hear what's going on. Think about what it would take to implement something like Traktor in the browser.

robianmcd commented 6 years ago

@haywirez I built a POC similar to Traktor that lets you scrub with Torq control vinyl (at least that was the idea): https://robianmcd.github.io/open-dvs/. To do it I had to create a ScriptProcessorNode that keeps track of the speed, direction, and current offset in a song and manually picks out the frames of a song to play for the next buffer. https://github.com/robianmcd/open-dvs/blob/caae0842ae53ce4195b8f0e17d6fe36e51e70bfe/src/services/activeSong.ts#L125-L171

Definitely not an ideal/easy solution but it is possible.

mr21 commented 6 years ago

@haywirez Yes, I am thinking about something like this, but if the audio is already handled, why should the vertical audio cursor be more precise than something like 7.84645578 seconds?

selimachour commented 5 years ago

Hi guys. I just needed to draw a playback cursor over a waveform on a canvas. I'm building a player with AB looping and playbackRate controls for rehearsing on drums.

Here's a solution that worked for me :

  1. When I createBufferSource for loading my song, I also create a second bufferSource (not used for playback) which has the same number of samples as the song and whose samples I fill linearly from -1 to 1.

  2. I then connect this second bufferSource (I call it the positionBuffer) to a scriptProcessor with an onaudioprocess that computes the samplePosition by simple interpolation of the first sample's value in the processBuffer, from -1..1 to 0..nbSamples. (Apparently you do have to connect the processor to the destination, otherwise it won't be run.)

  3. All play actions, loop changes, resume and playbackRate changes are then always done on both bufferSources, the song's and the position's.

That's it. So far so good.
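The two calculations in steps 1 and 2 can be sketched as below. The function names are hypothetical, and the mapping targets indices 0..nbSamples-1. Since the ramp is stored as 32-bit floats, resolution becomes coarser for very long buffers:

```javascript
// Step 1: fill a channel with a linear ramp from -1 (first sample)
// to 1 (last sample). The result can be copied into an AudioBuffer channel.
function makePositionRamp(nbSamples) {
  const data = new Float32Array(nbSamples);
  for (let i = 0; i < nbSamples; i++) {
    data[i] = nbSamples > 1 ? (2 * i) / (nbSamples - 1) - 1 : -1;
  }
  return data;
}

// Step 2: map a ramp value read in onaudioprocess back to a sample index.
function rampValueToSample(value, nbSamples) {
  return Math.round(((value + 1) / 2) * (nbSamples - 1));
}
```

In onaudioprocess, the value passed to `rampValueToSample` would be the first sample of the input block, so the reported position is accurate to one processing block.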

mdjp commented 5 years ago

We cannot provide a sample-accurate attribute. This could potentially be achieved using the AudioWorklet. However, we see the use for a non-sample-accurate playback position on the main thread and will consider this for V2.

padenot commented 4 years ago

This also provides another way to have a very accurate clock, this needs to be considered. This should be updated as often as AudioContext.currentTime, so the clock resolution is not that problematic.

padenot commented 4 years ago

All that's left is to decide if it's an attribute or a method, and the name.