WebAudio / web-audio-api

The Web Audio API v1.0, developed by the W3C Audio WG
https://webaudio.github.io/web-audio-api/

Specify static Map of audio headers that can be prepended to Uint8Array for decodeAudioData() #2135

Closed guest271314 closed 4 years ago

guest271314 commented 4 years ago

Describe the feature

Briefly describe the feature you would like WebAudio to have.

decodeAudioData() does not decode audio data lacking a header.

Is there a prototype?

If you have a prototype (possibly using an AudioWorkletNode), provide links to illustrate this addition. This is the best way to propose a new feature.

No prototype. There is a design pattern.

Describe the feature in more detail

Provide more information on what it does and how it works.

The basic algorithm https://github.com/WebAudio/web-audio-api/issues/337#issuecomment-60258749

I'm a bit late to this one but I can't see how the decoding of partial content could be done automatically with decodeAudioData(). File formats such as OGG Vorbis (and MP3 if I remember correctly) store the information required to decode the audio in the file header. The decoder could store that information somewhere but it would have no idea if N number of calls to decodeAudioData() were sequential and/or partial chunks from the same source file. Also, the structure of the source file needs to be considered, some formats can only be decoded in chunks (frames) of specific sizes.

It's actually possible to "stream" audio now using decodeAudioData() if the programmer understands the format of a particular audio file. I managed to do this with OGG Vorbis (load and decode it in chunks) a while ago by storing the required decoding information and prepending it to the beginning of each chunk prior to decoding. The data loading and splicing can be done in a worker and then transferred (zero-copy) to the main thread for decoding.
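The byte handling described in the quoted comment could look roughly like the sketch below. It is only an illustration under stated assumptions: `prependHeader` and `lastOggPageStart` are hypothetical helper names, not part of any API, and the sketch assumes the header bytes have already been captured from the start of the file. It relies on Ogg pages beginning with the 4-byte capture pattern "OggS"; the same four bytes could also occur by chance inside a payload, so a real implementation would need to validate the page as well.

```javascript
// Ogg pages begin with the 4-byte capture pattern "OggS".
const OGG_CAPTURE = [0x4f, 0x67, 0x67, 0x53];

// Hypothetical helper: index of the last "OggS" pattern in `bytes`, or -1.
// Bytes from that index onward may be an incomplete page and can be held
// back until more data arrives.
function lastOggPageStart(bytes) {
  for (let i = bytes.length - OGG_CAPTURE.length; i >= 0; i--) {
    if (OGG_CAPTURE.every((b, j) => bytes[i + j] === b)) {
      return i;
    }
  }
  return -1;
}

// Hypothetical helper: new Uint8Array holding `header` followed by `chunk`.
function prependHeader(header, chunk) {
  const out = new Uint8Array(header.length + chunk.length);
  out.set(header, 0);
  out.set(chunk, header.length);
  return out;
}
```

The `.buffer` of the array returned by `prependHeader` is what would be handed to decodeAudioData(); whether a given browser decodes such a reassembled chunk is not guaranteed by this sketch.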

Basic design pattern

const ac = new AudioContext();
const res = [];
let initialized = false;
// proposed static Map of codec headers (AudioContext.headers does not exist yet)
const opusHeader = AudioContext.headers.get("audio/ogg;codecs=opus");
// audioWorklet is assumed to be an existing AudioWorkletNode
async function processStream({value, done}) {
  if (done) {
    return this.closed;
  }
  for (let i = 0; i < value.length; i++) {
    res[res.length] = value[i];
  }
  if (!initialized) {
    // wait for header and enough audio for 1 second playback
    if (res.length > 43000) {
      const init = await ac.decodeAudioData(new Uint8Array(res.splice(0, res.length)).buffer);
      initialized = true;
      // an AudioBuffer cannot be posted; transfer the channel samples instead
      const channel = init.getChannelData(0);
      audioWorklet.port.postMessage({init: channel}, [channel.buffer]);
    }
  } else {
    const audioDataWithoutHeader = res.splice(0, res.length);
    audioDataWithoutHeader.unshift(...opusHeader); // add header bytes
    const next = await ac.decodeAudioData(new Uint8Array(audioDataWithoutHeader).buffer);
    const channel = next.getChannelData(0);
    audioWorklet.port.postMessage({next: channel}, [channel.buffer]);
  }
  // keep reading until the stream is done
  return this.read().then(processStream.bind(this));
}
fetch("/path/to/oggContainerOpusCodec")
.then(response => response.body.getReader())
.then(reader => reader.read().then(processStream.bind(reader)))
.catch(console.error);

This will allow streaming any audio without using Media Source Extensions or HTMLMediaElement.

guest271314 commented 4 years ago

This code plays, with gaps between data being written to Uint8Array, until tab memory usage freezes the tab, or the OS freezes.

There needs to be a means to flush memory in this case.

<!DOCTYPE html>
<html>

<head>
  <title>AudioContext stream</title>
</head>

<body>
  <audio controls autoplay preload="auto"></audio>
  <script>
    const ac = new AudioContext({
      numberOfChannels: 2,
      sampleRate: 48000
    });
    const msd = ac.createMediaStreamDestination();
    const {
      stream
    } = msd;
    const [track] = stream.getAudioTracks();
    track.enabled = false;
    const gainNode = new GainNode(ac, {
      gain: 1
    });
    gainNode.connect(msd);

    gainNode.connect(ac.destination);
    const audio = document.querySelector("audio");

    let init = false;
    let durationOffset = 0;
    let n = 0;
    let contentLength = 0;
    let len = 0;
    let arrayBuffer;
    let audioData;
    let ab;

    async function processStream({
      value, done
    }) {
      try {

        if (done) {
          return this.closed;
        }

        for (let i = 0; i < value.byteLength; i++, n++) {
          audioData[n] = value[i];
        }
        const sub = audioData.slice(0, n);
        console.log(sub);
        ab = await ac.decodeAudioData(sub.buffer).catch(_ => void 0);
        if (ab) {
          if (ab.length > len && ab.duration > durationOffset) {
            console.log(durationOffset);
            const source = ac.createBufferSource();
            source.buffer = ab;
            source.connect(msd);
            source.start(ac.currentTime, durationOffset);
            if (audio.srcObject !== stream) {
              audio.srcObject = stream;
              track.enabled = true;
            }
            await new Promise(resolve => {
              source.onended = e => {
                source.disconnect();
                durationOffset = ab.duration;
                len = ab.length;
                console.log(ab, sub);
                resolve();
              }
            })

          }

          return this.read().then(processStream.bind(this)).catch(e => {
            throw e
          })

        }
        return this.read().then(processStream.bind(this)).catch(e => {
          throw e
        })
      } catch (e) {
        console.trace(e);
      }
    }
    fetch("https://fetch-stream-audio.anthum.com/72kbps/opus/house--64kbs.opus?cacheBust=1")
      .then(response => {
        contentLength = response.headers.get("content-length");
        arrayBuffer = new ArrayBuffer(contentLength);
        audioData = new Uint8Array(arrayBuffer);
        return response.body.getReader()
      })
      .then(reader => {
        reader.read.call(reader).then(processStream.bind(reader))
      })
  </script>
</body>

</html>

padenot commented 4 years ago

This is what we're doing in Web Codecs.

guest271314 commented 4 years ago

That response has been given several times for closure of issues. Unless missing critical documentation, nothing in the WebCodecs concept is clearly specified or shipped. Does your answer mean decodeAudioData() is a relic that will not be improved, in favor of an API that does not yet exist?

padenot commented 4 years ago

As noted a bit everywhere (in closed and open issues on this repo and the v2 repo, and public minutes of calls):

rtoy commented 4 years ago

Let me also add that we will work closely with the WebCodecs group to make sure that it can provide solutions to the problems people have had with decodeAudioData. Ideally, WebCodecs should be able to do everything that decodeAudioData does and more.

guest271314 commented 4 years ago

we're not taking any features,

Then why is that "Feature request" template here, in this repository? Kindly remove that from your issues template to avoid confusion.

and we've decided against fixing them by piling hacks on top of it

That is all you had to say in the first place. Broke, won't fix.

Ideally, WebCodecs should be able to do everything that decodeAudioData does and more.

Well, again, WebCodecs is not operational. How are users supposed to know that Web Audio API authors are deferring to an as-yet non-existent API that is still in the fledgling stages of development?

In any event am banned from the WICG organization for 1,000 years for some fraudulent, hypocritical reason they and their parent org. W3C concocted.

Kindly ping this closed issue when what you are tentatively referring to as being possible in the future, decoding partial audio not preceded by a header by way of WebCodecs, is actually specified and deployed in the field.

guest271314 commented 4 years ago

Was just trying to see if could get the code working with existing Web Audio API technology. Since am banned from WICG this issue https://github.com/WICG/web-codecs/issues/28 should suffice for remedying the issue with decodeAudioData() and use of WebCodecs to do so. Note that decodeAudioData() is not the only way the above code can yield the expected result. That is why asked the authors of this Web Audio API precisely how to solve the use case, by any means.

When that issue is closed as fixed by a PR what has been referred to above by you will be the fruition of that reference. Look forward to that achievement. Thanks for your efforts.