mreinstein closed this issue 2 years ago
lamejs is just a bit-manipulation library: it transforms wave audio bits into mp3 bits. Integrations into APIs are welcome as pull requests.
> lamejs is just a bit-manipulation library: it transforms wave audio bits into mp3 bits
@zhuker yeah, I get that. Maybe I misunderstand, but it seems that lamejs essentially takes Int16Array data as input and produces Uint8Array-encoded data as output. Is that right? I think the AudioWorklet API takes Float32Array data as input and output.
> Integrations into APIs are welcome as pull requests
I started working on this in a branch, but ran into the aforementioned issue. Would be happy to send a PR if/when it works!
Input: You can easily convert float32 to int16 by multiplying each member of the array by 32767
Output: MP3 is a stream of bytes, hence uint8; outputting it as float32 makes no sense.
I am not very familiar with AudioWorklet, but lamejs should be the terminating node in an audio pipeline. Shouldn't it?
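The suggested conversion can be sketched as follows (the function name is illustrative, not part of lamejs):

```javascript
// Sketch: scale Float32 samples in [-1, 1] to signed 16-bit PCM.
// floatTo16BitPCM is an illustrative name, not a lamejs API.
function floatTo16BitPCM(float32) {
  const int16 = new Int16Array(float32.length);
  for (let i = 0; i < float32.length; i++) {
    // Clamp first so out-of-range input cannot wrap around.
    const s = Math.max(-1, Math.min(1, float32[i]));
    // Negative samples scale by 32768, positive by 32767,
    // so both -1 and +1 map onto the full int16 range.
    int16[i] = s < 0 ? s * 32768 : s * 32767;
  }
  return int16;
}
```

lamejs's `Mp3Encoder.encodeBuffer()` can then consume these Int16Array samples; its output is a buffer of mp3 bytes, which is why a float32 output makes no sense.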
> lamejs should be the terminating node in an audio pipeline. Shouldn't it?
The audio pipeline in my use case:
```
                      ┌---------------------┐
                  ┌-->|watson speech-to-text|
┌---┐   ┌------┐  |   └---------------------┘
|mic├-->|lamejs├--┤
└---┘   └------┘  |   ┌---------------------------┐
                  └-->|indexeddb (browser storage)|
                      └---------------------------┘
```
> Output: MP3 is a stream of bytes, hence uint8; outputting it as float32 makes no sense
Yeah, this is what I'm struggling with. Unless I'm mistaken, per https://webaudio.github.io/web-audio-api/#defining-a-valid-audioworkletprocessor, AudioWorklet outputs are Float32Arrays. I'm trying to figure out how to package this in a sensible way as an AudioWorklet so that lamejs can be used as a normal webaudio node.
Did you solve this?
That was 5 years ago; I haven't been working on audio processing lately. AudioWorklet support has gotten pretty decent now, though. It should be pretty feasible in theory.
I achieved the requirement using the details here https://github.com/zhuker/lamejs/pull/81/commits/e18447fefc4b581e33a89bd6a51a4fbf1b3e1660.
Is the issue resolved?
I guess I can take a look and see if I can make that work via AudioWorklet.
This is what I am doing with raw PCM input that I simultaneously stream with MediaStreamTrackGenerator and record with lamejs (which I modified to be a module export), in pertinent part:
```javascript
async importEncoder() {
  if (this.mimeType.includes('mp3')) {
    const lamejs = (await import('./lame.min.js')).default;
    this.mp3encoder = new lamejs.Mp3Encoder(2, 44100, 128);
    this.mp3Data = [];
  } else if (this.mimeType.includes('opus')) {
    const { Decoder, Encoder, tools, Reader, injectMetadata } = await import('./ts-ebml.min.js');
    Object.assign(this, { Decoder, Encoder, tools, Reader, injectMetadata });
  }
}
```
```javascript
const int8 = new Int8Array(441 * 4);
const { value, done } = await this.inputReader.read();
// value: raw PCM from parec -d @DEFAULT_MONITOR@
if (!done) int8.set(new Int8Array(value));
const int16 = new Int16Array(int8.buffer);
// https://stackoverflow.com/a/35248852
const channels = [new Float32Array(441), new Float32Array(441)];
for (let i = 0, j = 0, n = 1; i < int16.length; i++) {
  const int = int16[i];
  // Int16Array values are already signed; scale negatives by 0x8000
  // and positives by 0x7fff so both ends map onto [-1, 1].
  const float = int < 0 ? int / 0x8000 : int / 0x7fff;
  // deinterleave
  channels[(n = ++n % 2)][!n ? j++ : j - 1] = float;
}
// var floatSamples = new Float32Array(44100); // Float sample from an external source
const left = channels.shift();
const right = channels.shift();
let leftChannel, rightChannel;
if (this.mimeType.includes('mp3')) {
  const sampleBlockSize = 441;
  leftChannel = new Int32Array(left.length);
  rightChannel = new Int32Array(right.length);
  for (let i = 0; i < left.length; i++) {
    leftChannel[i] = left[i] < 0 ? left[i] * 32768 : left[i] * 32767;
    rightChannel[i] = right[i] < 0 ? right[i] * 32768 : right[i] * 32767;
  }
}
const data = new Float32Array(882);
data.set(left, 0);
data.set(right, 441);
const frame = new AudioData({
  timestamp,
  data,
  sampleRate: 44100,
  format: 'f32-planar',
  numberOfChannels: 2,
  numberOfFrames: 441,
});
this.duration += frame.duration;
await this.audioWriter.write(frame);
if (this.mimeType.includes('mp3')) {
  const mp3buf = this.mp3encoder.encodeBuffer(leftChannel, rightChannel);
  if (mp3buf.length > 0) {
    this.mp3Data.push(mp3buf);
  }
}

// When the input stream ends, flush the encoder and resolve the result.
if (this.mimeType.includes('mp3')) {
  const mp3buf = this.mp3encoder.flush(); // finish writing mp3
  if (mp3buf.length > 0) {
    this.mp3Data.push(new Int8Array(mp3buf));
  }
  this.resolve(new Blob(this.mp3Data, { type: 'audio/mp3' }));
}
```
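The deinterleave loop above is terse; the same step can be written more explicitly as a standalone sketch (the helper name is illustrative, assuming interleaved signed 16-bit stereo input):

```javascript
// Sketch: split interleaved 16-bit stereo PCM (L R L R ...) into two
// planar Float32Arrays, scaled to [-1, 1]. Illustrative helper only.
function deinterleave(int16) {
  const frames = int16.length / 2;
  const left = new Float32Array(frames);
  const right = new Float32Array(frames);
  for (let i = 0; i < frames; i++) {
    const l = int16[2 * i];     // even indices: left channel
    const r = int16[2 * i + 1]; // odd indices: right channel
    left[i] = l < 0 ? l / 32768 : l / 32767;
    right[i] = r < 0 ? r / 32768 : r / 32767;
  }
  return { left, right };
}
```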
In an AudioWorklet we can use top-level `import` and modify `sampleBlockSize` to 128.
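A rough sketch of the worklet side under those assumptions: `process()` callbacks deliver 128-frame render quanta, so the processor accumulates them into whatever block size the encoder consumes. The names here (`BlockBuffer`, `'mp3-capture'`) are illustrative, not part of lamejs, and the `registerProcessor` call is guarded so the buffering logic can also run outside a worklet scope:

```javascript
// Sketch of worklet-side buffering (illustrative names, not a lamejs API).
// AudioWorklet delivers audio in 128-frame render quanta; this class
// accumulates them into fixed-size blocks for an encoder such as lamejs.
class BlockBuffer {
  constructor(blockSize) {
    this.blockSize = blockSize;
    this.chunks = [];
    this.length = 0;
  }
  push(float32) {
    this.chunks.push(float32);
    this.length += float32.length;
  }
  // Returns one full block, or null until enough samples have accumulated.
  takeBlock() {
    if (this.length < this.blockSize) return null;
    const out = new Float32Array(this.blockSize);
    let filled = 0;
    while (filled < this.blockSize) {
      const chunk = this.chunks[0];
      const need = this.blockSize - filled;
      if (chunk.length <= need) {
        out.set(chunk, filled);
        filled += chunk.length;
        this.chunks.shift();
      } else {
        out.set(chunk.subarray(0, need), filled);
        this.chunks[0] = chunk.subarray(need);
        filled += need;
      }
    }
    this.length -= this.blockSize;
    return out;
  }
}

// Guarded: AudioWorkletProcessor and registerProcessor only exist
// inside an AudioWorkletGlobalScope.
if (typeof AudioWorkletProcessor !== 'undefined') {
  registerProcessor('mp3-capture', class extends AudioWorkletProcessor {
    constructor() {
      super();
      this.buf = new BlockBuffer(128); // or e.g. 1152, one MPEG audio frame
    }
    process(inputs) {
      const channel = inputs[0][0];
      if (channel) {
        this.buf.push(channel.slice()); // copy; the quantum buffer is reused
        let block;
        while ((block = this.buf.takeBlock()) !== null) {
          // Here one would convert to Int16Array, feed encodeBuffer(),
          // and post the resulting mp3 bytes to the main thread.
          this.port.postMessage(block, [block.buffer]);
        }
      }
      return true;
    }
  });
}
```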
You should be able to incorporate the changes from https://github.com/guest271314/AudioWorkletStream. FWIW, for speech synthesis processing you can also utilize https://github.com/guest271314/native-messaging-espeak-ng. I am currently updating https://github.com/guest271314/captureSystemAudio for MP3 support. Next I will substitute https://github.com/davedoesdev/webm-muxer.js for MediaRecorder.
Having played around a little with AudioWorklets just now, I can say with more confidence that my original ask just doesn't really make sense.
WebAudio Nodes are intended to operate on Float32Arrays, both as input and output. If one were to package lamejs as an audio worklet, it would have to follow this format.
None of the existing webaudio graph nodes can accept lamejs-encoded mp3, so lamejs only really makes sense as a terminating node in the webaudio graph. My original graph diagram visualizes this: I was piping from lamejs to watson-speech-to-text and a local storage sink. Neither of these destinations benefits from being represented as a webaudio node.
> Having played around a little with AudioWorklets just now, I can say with more confidence that my original ask just doesn't really make sense.
Yes, it does make sense.
> WebAudio Nodes are intended to operate on Float32Arrays, both as input and output. If one were to package lamejs as an audio worklet, it would have to follow this format.
Not necessarily; parse and convert the data to the expected TypedArray.
> None of the existing webaudio graph nodes can accept lamejs encoded mp3.
Technically it can, via HTML `<audio>` with `MediaElementSource`, or `captureStream()` connected to `MediaStreamAudioDestinationNode`, or `MediaStreamAudioSourceNode` connected to `AudioWorkletNode`.
> My original graph diagram visualizes this: I was piping from lamejs to watson-speech-to-text and a local storage sink. Neither of these destinations benefit from being represented as a webaudio node.
The benefit is flexibility, and fidelity, particularly for speech-to-text. (Though Mozilla Voice does use MP3.)
You can certainly pipe a `MediaStreamTrack` through `AudioWorkletNode` to encode the stream in "real time" and send it to other destinations and save it simultaneously. The requirement is possible.
Chrome is about to land AudioWorklet and deprecate ScriptProcessorNode.
It would be awesome if lamejs could be used as a normal WebAudio node.
https://www.chromestatus.com/features/4588498229133312