I used `OfflineAudioContext` to resample the hard-coded 48000 sample rate and `numberOfFrames` (2568 for the first and 2880 for the remainder of the `AudioData` objects output by `AudioDecoder`).
From the Chromium source:

```cpp
// For Opus, we try to encode 60ms, the maximum Opus buffer, for quality
// reasons.
constexpr int kOpusPreferredBufferDurationMs = 60;
// default preferred 48 kHz. If the input sample rate is anything else, we'll
// use 48 kHz.
```
something like:

```js
const TARGET_FRAME_SIZE = 220;
const TARGET_SAMPLE_RATE = 22050;
// ...
// chunk_length, len, channelData, decoderController, and decoderResolve are
// defined in the surrounding (elided) code.
const config = {
  numberOfChannels: 1,
  sampleRate: 22050, // Chrome hardcodes to 48000
  codec: 'opus',
  bitrate: 16000,
};
encoder.configure(config);
const decoder = new AudioDecoder({
  error(e) {
    console.error(e);
  },
  async output(frame) {
    ++chunk_length;
    const { duration, numberOfChannels, numberOfFrames, sampleRate } = frame;
    const size = frame.allocationSize({ planeIndex: 0 });
    const data = new ArrayBuffer(size);
    frame.copyTo(data, { planeIndex: 0 });
    const buffer = new AudioBuffer({
      length: numberOfFrames,
      numberOfChannels,
      sampleRate,
    });
    buffer.getChannelData(0).set(new Float32Array(data));
    // Resample via OfflineAudioContext, https://stackoverflow.com/a/27601521
    const oac = new OfflineAudioContext(
      buffer.numberOfChannels,
      Math.ceil(buffer.duration * TARGET_SAMPLE_RATE),
      TARGET_SAMPLE_RATE
    );
    // Play it from the beginning.
    const source = new AudioBufferSourceNode(oac, {
      buffer,
    });
    source.connect(oac.destination);
    source.start();
    const ab = (await oac.startRendering()).getChannelData(0);
    // Collect the resampled samples and enqueue fixed-size frames.
    for (let i = 0; i < ab.length; i++) {
      if (channelData.length === TARGET_FRAME_SIZE) {
        const floats = new Float32Array(
          channelData.splice(0, TARGET_FRAME_SIZE)
        );
        decoderController.enqueue(floats);
      }
      channelData.push(ab[i]);
    }
    // Flush any remainder once the final chunk has been decoded.
    if (chunk_length === len) {
      if (channelData.length) {
        const floats = new Float32Array(TARGET_FRAME_SIZE);
        floats.set(channelData.splice(0, channelData.length));
        decoderController.enqueue(floats);
        decoderController.close();
        decoderResolve();
      }
    }
  },
});
```
The audio playback quality is sub-par when resampling from 48000 to 22050. What is the suggested procedure to produce quality audio, without glitches, gaps, or faster- or slower-rate frames, when converting from WebCodecs `AudioData` to `AudioBuffer`, for the purpose of breaking out of the hard-coded box of Chrome's WebCodecs implementation?
The current design direction is to be able to create `AudioBuffer` objects directly from typed arrays, and to allow `AudioBuffer` to internally use more data types than f32. For now, authors need to create an `AudioBuffer` of the same size, use `AudioData.copyTo` to copy to an intermediate `ArrayBuffer`, and then copy (with possible conversion) to the `AudioBuffer`. This is wasteful and not ergonomic.
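A minimal sketch of that interim copy-twice workaround, assuming mono `'f32'` output (for interleaved formats, `copyTo` accepts only `planeIndex: 0`, and with one channel the single plane is the channel):

```js
// Sketch only: convert an AudioData to an AudioBuffer via an intermediate
// ArrayBuffer, as described above. Assumes mono, 'f32' sample format.
function audioDataToAudioBuffer(audioData) {
  const { numberOfFrames, sampleRate } = audioData;
  // First copy: AudioData.copyTo() can only fill an (Array)Buffer.
  const bytes = new ArrayBuffer(audioData.allocationSize({ planeIndex: 0 }));
  audioData.copyTo(bytes, { planeIndex: 0 });
  // Second copy into the AudioBuffer; both sides are f32 here, so no
  // conversion is needed, but other sample formats would convert at this step.
  const audioBuffer = new AudioBuffer({
    length: numberOfFrames,
    numberOfChannels: 1,
    sampleRate,
  });
  audioBuffer.copyToChannel(new Float32Array(bytes), 0);
  return audioBuffer;
}
```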
Another design direction is to be able to get the memory of an `AudioData` and directly construct an `AudioBuffer` from this memory, skipping all copies (https://github.com/w3c/webcodecs/issues/287).
There are several issues.

- `AudioData` at `AudioDecoder.output` is absolutely dissimilar from the input `AudioData` at `AudioEncoder.encode`; developers use codecs for compression, not for implementation restrictions. Might as well just use `opusenc` directly, or in WASM form, if we cannot do the equivalent of `opusenc --raw-rate 22050 input.wav output.opus` in `AudioEncoder` configuration; the options are ignored by the implementation. I can use Native Messaging and `fetch()` reliably, and `WebTransport` far less reliably, to stream input from the browser and get STDOUT from a native application.
- `MediaStreamTrackGenerator`, where the `AudioData` output from an `OscillatorNode` connected to a `MediaStreamAudioDestinationNode` and processed with `MediaStreamTrackProcessor` is also dissimilar to WebCodecs `AudioDecoder.decode()` output at the `output` callback, with the same input.
- `AudioWorklet` results in fewer glitches than `MediaStreamTrackGenerator` when the input is processed in a `Worker` and `WebAssembly.Memory.grow()` is used, because when multiple `ReadableStream`s are processing in parallel and piped through and to other streams on the same thread, one can take priority and result in glitches in initial playback until the input is completely read. However, the only way to get an `AudioWorklet` instance is via an ECMAScript module, which limits usage due to CSP, and `AudioWorklet` does not expose `fetch()` or `WebTransport`; thus, use a single memory with the ability to grow. WebAssembly collects garbage; calling `MediaStreamTrackGenerator` `stop()` can crash the tab.

In summary, there needs to be consistency between these burgeoning APIs so that user-defined conversion is not necessary, or so that a user who decides to convert between `AudioData` and `AudioBuffer` can do so "seamlessly". WebCodecs has free rein to do whatever it wants; why would the decoder only output a 48000 sample rate when I deliberately input a 22050 sample rate and 1 channel in the configuration? That is inviting user-defined conversion (and its issues).
I updated and tested the code using `OfflineAudioContext` a few hundred more times and compared to creating a WAV file using data from `AudioData.copyTo()`:
```js
// Based on https://github.com/higuma/wav-audio-encoder-js
class WavAudioEncoder {
  constructor({ buffers, sampleRate, numberOfChannels }) {
    Object.assign(this, {
      buffers,
      sampleRate,
      numberOfChannels,
      numberOfSamples: 0,
      dataViews: [],
    });
  }
  // Write an ASCII string into the header at the given byte offset.
  setString(view, offset, str) {
    const len = str.length;
    for (let i = 0; i < len; i++) {
      view.setUint8(offset + i, str.charCodeAt(i));
    }
  }
  async encode() {
    const [{ length }] = this.buffers;
    // Interleave channels and convert f32 samples to 16-bit PCM.
    const data = new DataView(
      new ArrayBuffer(length * this.numberOfChannels * 2)
    );
    let offset = 0;
    for (let i = 0; i < length; i++) {
      for (let ch = 0; ch < this.numberOfChannels; ch++) {
        let x = this.buffers[ch][i] * 0x7fff;
        data.setInt16(
          offset,
          x < 0 ? Math.max(x, -0x8000) : Math.min(x, 0x7fff),
          true
        );
        offset += 2;
      }
    }
    this.dataViews.push(data);
    this.numberOfSamples += length;
    // Build the 44-byte RIFF/WAVE header.
    const dataSize = this.numberOfChannels * this.numberOfSamples * 2;
    const view = new DataView(new ArrayBuffer(44));
    this.setString(view, 0, 'RIFF');
    view.setUint32(4, 36 + dataSize, true);
    this.setString(view, 8, 'WAVE');
    this.setString(view, 12, 'fmt ');
    view.setUint32(16, 16, true); // fmt chunk size
    view.setUint16(20, 1, true); // PCM format
    view.setUint16(22, this.numberOfChannels, true);
    view.setUint32(24, this.sampleRate, true);
    // Byte rate: sampleRate * numberOfChannels * bytesPerSample. The original
    // library hard-coded sampleRate * 4, which is only correct for 2 channels.
    view.setUint32(28, this.sampleRate * this.numberOfChannels * 2, true);
    view.setUint16(32, this.numberOfChannels * 2, true); // block align
    view.setUint16(34, 16, true); // bits per sample
    this.setString(view, 36, 'data');
    view.setUint32(40, dataSize, true);
    this.dataViews.unshift(view);
    return new Blob(this.dataViews, { type: 'audio/wav' }).arrayBuffer();
  }
}
// ...
const wav = new WavAudioEncoder({
  sampleRate: 48000,
  numberOfChannels: 1,
  buffers: [new Float32Array(data)],
});
// `ac` is an existing AudioContext.
const ab = (await ac.decodeAudioData(await wav.encode())).getChannelData(0);
```
Glitches can occasionally occur at the beginning of the `OfflineAudioContext` playback. No glitches occur when creating WAV headers and prepending the headers to the data. Test and compare the differences for yourself: https://guest271314.github.io/webcodecs/.

Are these the simplest approaches to resample the output from `AudioDecoder.decode()`?
The important point is that it is only necessary to resample the data from `AudioData` at `AudioDecoder.output` because WebCodecs does not honor the `AudioEncoder` or `AudioDecoder` configuration and resamples to 48000, and outputs a `numberOfFrames` far greater than the input `numberOfFrames`, which is inconsistent behaviour.

If there were consistency between WebCodecs `AudioEncoder.output` and `AudioDecoder.output` with regard to `AudioData`, there would be no need to resample with the Web Audio API.
Two things:

1. This approach to resampling segments of audio with an `OfflineAudioContext` cannot work. Non-naive audio resampling is a stateful operation, and creating a new `OfflineAudioContext` each time doesn't allow keeping any state. Resampling using an `OfflineAudioContext` only works if the entirety of the audio is resampled in one operation (see the sketch after this list).
2. Resampling to another rate is not in the scope of Web Codecs. Web Codecs is just about decoding and encoding, and resampling the audio to play it out is expected for now, since there is no resampler object in the Web Platform yet. Opus always works at 48 kHz internally, and by default always decodes to 48 kHz, so this is what you see in Web Codecs. For other codecs, you'll see that the rate is (usually) the rate of the input stream.
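To illustrate the first point, a sketch of resampling the entire signal in a single `OfflineAudioContext` operation, under the assumption that all decoded mono `Float32Array` chunks at 48000 Hz have been collected first (the `decodedChunks` name is illustrative):

```js
// Sketch: resample the whole signal once, rather than per decoded chunk, so
// the resampler's state spans the full input. `decodedChunks` is assumed to
// be an array of mono Float32Arrays at 48000 Hz.
async function resampleWhole(decodedChunks, targetSampleRate = 22050) {
  const totalLength = decodedChunks.reduce((n, chunk) => n + chunk.length, 0);
  const input = new AudioBuffer({
    length: totalLength,
    numberOfChannels: 1,
    sampleRate: 48000,
  });
  // Concatenate the chunks into one AudioBuffer.
  let offset = 0;
  for (const chunk of decodedChunks) {
    input.copyToChannel(chunk, 0, offset);
    offset += chunk.length;
  }
  // One stateful resample over the entire signal.
  const oac = new OfflineAudioContext(
    1,
    Math.ceil(input.duration * targetSampleRate),
    targetSampleRate
  );
  const source = new AudioBufferSourceNode(oac, { buffer: input });
  source.connect(oac.destination);
  source.start();
  return oac.startRendering();
}
```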
The problem is that resampling is necessary based on WebCodecs output.

All you need do is test the output of `AudioDecoder` and try to pass that `AudioData` directly to a `MediaStreamTrackGenerator`. One of two outcomes currently exists without user-defined intervention: `AudioDecoder` => `MediaStreamTrackGenerator`.

I can do `$ opusenc --raw-rate 22050 input.wav output.opus` and get the output I set. WebCodecs ignores the configuration, yet claims "flexibility". Since you are citing 48 kHz as the inflexible default for the WebCodecs implementation of 'opus', you need to update your specification to state that unambiguously, so that I will no longer expect the option I pass to be effectual.
Resampling is necessary with the output of WebCodecs `AudioDecoder` `AudioData` to other APIs, without using `setTimeout()` and essentially guessing when the incompatible `AudioData` will end.

I suggest you folks actually test `AudioDecoder` => `MediaStreamTrackGenerator`, and stop claiming WebCodecs is "flexible" if you intend on restricting options available using `opusenc` and `opusdec`. I might as well just use `opusenc` and `opusdec` with `fetch()` or `WebTransport`.
**Describe the feature**

WebCodecs defines `AudioData`. In the WebCodecs specification this note appears:

> The Web Audio API currently uses `f32-planar` exclusively.

However, the `format` of `AudioData` from `AudioDecoder` is `'f32'`, not `'f32-planar'`.

Even though the `sampleRate` set at `AudioDecoder` configuration is other than 48000 (and `opusenc` supports a `--raw-rate` option to specifically set the sample rate for Opus-encoded audio), the resulting WebCodecs `AudioData` instance always has `sampleRate` set to 48000.

The effective result is that there is no way that I am aware of to convert the data from `AudioData.copyTo(ArrayBuffer, {planeIndex: 0})` to an `AudioBuffer` instance that can be played with an `AudioBufferSourceNode` or resampled to a different `sampleRate`, for example, 22050.

Since `MediaStreamTrackGenerator` suffers from "overflow", and no algorithm exists in the WebCodecs specification to handle the overflow outside of one defined by the user, it is necessary for the user to write the algorithm. After testing, a user might find a magic number to delay the next call to `MediaStreamTrackGenerator.writable` `WritableStreamDefaultWriter.write()` (https://plnkr.co/edit/clbdVbhaRhCKWmPS); that approach does not achieve the same result when attempting to use a Web Audio API `AudioBuffer` and `AudioBufferSourceNode`, verifying that the `AudioData` data and `AudioBuffer` channel data are incompatible.

**Is there a prototype?**

No.
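Since the decoder's `'f32'` output is interleaved while `AudioBuffer` channel data is planar, a hedged sketch of the de-interleaving step such a conversion would need (the function name is illustrative only):

```js
// Sketch: split interleaved 'f32' samples into per-channel planes suitable
// for AudioBuffer.copyToChannel(). `interleaved` is assumed to contain
// numberOfFrames * numberOfChannels samples.
function deinterleave(interleaved, numberOfChannels) {
  const numberOfFrames = interleaved.length / numberOfChannels;
  const planes = Array.from(
    { length: numberOfChannels },
    () => new Float32Array(numberOfFrames)
  );
  for (let i = 0; i < interleaved.length; i++) {
    planes[i % numberOfChannels][Math.floor(i / numberOfChannels)] =
      interleaved[i];
  }
  return planes;
}
```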
**Describe the feature in more detail**

Web Audio API `AudioBuffer` <=> WebCodecs `AudioData`

Provide an algorithm and method to convert WebCodecs `AudioData` to Web Audio API `AudioBuffer`, with an option to set the sample rate of the resulting object.
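A purely hypothetical shape for such a method, not part of any specification; the name and options are invented here only to illustrate the desired ergonomics:

```js
// Hypothetical API, for illustration only: convert AudioData to AudioBuffer,
// resampling to the requested rate as part of the conversion.
const audioBuffer = await AudioBuffer.fromAudioData(audioData, {
  sampleRate: 22050, // desired output rate, e.g. to undo the hard-coded 48000
});
```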