Open benoitmercusot opened 3 years ago
What is the input audio stream format and what is logged at the console?
Hey. I'm using this https://www.npmjs.com/package/node-portaudio it says raw audio but your wav codec works just fine. I can't log / find anything that shows me why it drops :/ I was just wondering if it's related to the header of my stream (no length...), so maybe that's the problem.
PS : working on the port-message branch.
If you have control over the input stream you can remove the WAV header and use s16le to avoid removal of the first 44 bytes. Is the input 1 channel or 2 channels? What is the sampling rate?
Here is my input for now (using my iPhone earphones / mic for now). Works like a charm but stops.
const ai = new portAudio.AudioInput({
  channelCount: 2,
  sampleFormat: portAudio.SampleFormat16Bit,
  sampleRate: 44100,
  deviceId: -1 // Use -1 or omit the deviceId to select the default device
});
Tried other branches of your repo :) With the Wasm memory version, listening time seems to be limited by the amount of memory allowed. Not sure I understand this "and use s16le to avoid removal of the first 44 bytes" but very nice of you to answer.
According to https://www.npmjs.com/package/node-portaudio
// Note that this does not strip the WAV header so a click will be heard at the beginning
const rs = fs.createReadStream('steam_48000.wav');
Can you upload the WAV file here so that we can test using the same code?
Not sure I understand this "and use s16le to avoid removal of the first 44 bytes" but very nice of you to answer.
See http://www.topherlee.com/software/pcm-tut-wavformat.html, https://github.com/guest271314/AudioWorkletStream/blob/message-port-post-message/audioWorklet.js#L12.
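A minimal sketch of what "remove the WAV header and use s16le" could look like on the receiving side. It assumes a canonical 44-byte RIFF/WAVE header (real files can carry extra chunks and a longer header) and 16-bit signed little-endian samples; the names stripWavHeader and s16leToFloat32 are hypothetical and do not come from the repository.

```javascript
// Assumption: canonical 44-byte RIFF/WAVE header, s16le sample data.
const WAV_HEADER_BYTES = 44;

// Returns the chunk without its header when it starts with "RIFF",
// otherwise returns the chunk unchanged (already raw PCM).
function stripWavHeader(chunk) {
  const isRiff =
    chunk.length >= WAV_HEADER_BYTES &&
    chunk[0] === 0x52 && // "R"
    chunk[1] === 0x49 && // "I"
    chunk[2] === 0x46 && // "F"
    chunk[3] === 0x46;   // "F"
  return isRiff ? chunk.subarray(WAV_HEADER_BYTES) : chunk;
}

// Converts s16le bytes to floats in [-1, 1), the range Web Audio expects.
function s16leToFloat32(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = new Float32Array(bytes.byteLength >> 1);
  for (let i = 0; i < out.length; i++) {
    out[i] = view.getInt16(i * 2, true) / 32768; // true = little-endian
  }
  return out;
}
```

Only the first chunk of a stream would need stripWavHeader; a raw stream that never had a header passes through untouched.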
Well, it's actually not a file but a live stream from my mic. Actually, if you are interested and have time, I can PM you a link?
Interesting. Thanks. I actually removed this part // accumulate 344 512 1.5 of data (to achieve real time, maybe that's what's causing latency)
You should be able to record the microphone output per the NPM documentation. To capture microphone directly see also https://github.com/guest271314/SpeechSynthesisRecorder/issues/17#issuecomment-749875748, https://github.com/guest271314/setUserMediaAudioSource, https://github.com/guest271314/captureSystemAudio.
I actually removed this part // accumulate 344 512 1.5 of data (to achieve real time, maybe that's what's causing latency)
This is capable of streaming without waiting for accumulation of data https://github.com/guest271314/webtransport/blob/main/webTransportAudioWorkletWebAssemblyMemoryGrow.js.
Actually I'm not trying to record but to stream :) (server to browser, i.e. soundcard input (node server) to browser (client)). Your last link looks very nice, do you think I can use it with an endless wav/raw stream?
It is difficult to test and verify "endless" https://bugs.chromium.org/p/chromium/issues/detail?id=1161429. During testing I streamed 8 hours of audio yesterday.
Capturing entire system output or specific application audio output is possible by creating a virtual microphone and setting the source to an application or user-defined stream. Capturing at the browser with navigator.mediaDevices.getUserMedia({audio: true}) avoids the need to read and write bytes individually; after configuration the task can be reduced to HTMLMediaElement.srcObject = mediaStream.
The WebAssembly.Memory.grow() version of the code in this repository has implementation restrictions at Chrome which differ for 32-bit and 64-bit systems, see https://github.com/wasmerio/wasmer-php/issues/121. According to the Chrome issue the limit is 4GB for 64-bit; I have not yet tested the maximum. A ring buffer could be written which overwrites the previously used indexes of the SharedArrayBuffer, that is a TODO.
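As a rough worked example of how the memory limit translates into playback duration, the arithmetic below uses the input settings posted earlier in this thread (44100 Hz, 2 channels, 16-bit). It assumes the raw bytes are stored unmodified and reads "4GB" as 4 GiB; both are assumptions, not values taken from the repository.

```javascript
// Assumed input format (from the AudioInput settings posted in this thread).
const sampleRate = 44100;     // samples per second, per channel
const channelCount = 2;
const bytesPerSample = 2;     // 16-bit samples

const bytesPerSecond = sampleRate * channelCount * bytesPerSample; // 176400

// WebAssembly.Memory grows in pages of 64 KiB.
const WASM_PAGE_BYTES = 65536;
const pagesPerSecond = bytesPerSecond / WASM_PAGE_BYTES; // about 2.69

// Assumption: the 4GB limit is 4 GiB and all of it holds audio bytes.
const limitBytes = 4 * 1024 ** 3;
const secondsUntilLimit = limitBytes / bytesPerSecond;
const hoursUntilLimit = secondsUntilLimit / 3600; // about 6.76

console.log({ bytesPerSecond, pagesPerSecond, hoursUntilLimit });
```

A lower sample rate or a mono stream stretches that duration proportionally, which is why measuring on the actual system is still worthwhile.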
Thank you very much for all the explanation. Yes, there is navigator.mediaDevices.getUserMedia({audio: true}), but it's not working for what I want since I want the server to be the source :/ I'll take a look at all your links. All this is still very complicated and confusing for me! 🤪 You look way ahead of everyone on the internet regarding this specific API !! Have a good evening. Benoit.
For an "infinite" or "endless" audio stream I would try using WebTransport for the ReadableStream to avoid restrictions on the ServiceWorker approach; in that case the server handles the quic-transport protocol, see https://github.com/guest271314/webtransport/.
opus_stream_sw.zip
Thanks a ton. i'll look into that ! Benoit
Is this issue resolved?
héhé. i'm not that fast. i have to understand all the sources you gave me :)
Hi @guest271314, I was reading this whole thread. It looks like exactly what I was trying to achieve: https://github.com/wasmerio/wasmer-php/issues/121 (except using node instead of PHP passthru). Did you make any progress on this, regarding memory grow / duration? Thank you, Benoit.
The Native Messaging, PHP passthru() version is essentially the precursor to the QuicTransport and WebTransport versions. Since you read that issue you noted the Chrome bug which limits WebAssembly.Memory.grow() to 4GB on 64-bit systems. When I was testing that code I was on a 32-bit system. I have not yet tried to reach the maximum on 64-bit. It should be possible to use more than one Memory or SharedArrayBuffer or Typed Array instance, and/or overwrite the indexes that have already been parsed, using a "circular buffer" approach. I suggest testing the maximum on your system.
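The "circular buffer" idea could be sketched roughly as below. This is not code from the repository: ByteRingBuffer and its methods are hypothetical names, it uses a plain Uint8Array (a SharedArrayBuffer-backed one would need Atomics to be safe across threads), and it refuses to overwrite bytes that have not been read yet.

```javascript
// Hypothetical fixed-size ring buffer: memory use stays constant because
// bytes that were already read get overwritten by later writes.
class ByteRingBuffer {
  constructor(capacity) {
    this.bytes = new Uint8Array(capacity);
    this.readPos = 0;  // total bytes consumed so far
    this.writePos = 0; // total bytes produced so far
  }
  // Bytes written but not yet read.
  get available() {
    return this.writePos - this.readPos;
  }
  // Space that can be written without clobbering unread bytes.
  get free() {
    return this.bytes.length - this.available;
  }
  // Copies as much of chunk as fits; returns the number of bytes accepted.
  write(chunk) {
    const n = Math.min(chunk.length, this.free);
    for (let i = 0; i < n; i++) {
      this.bytes[(this.writePos + i) % this.bytes.length] = chunk[i];
    }
    this.writePos += n;
    return n;
  }
  // Fills target with up to target.length bytes; returns the number read.
  read(target) {
    const n = Math.min(target.length, this.available);
    for (let i = 0; i < n; i++) {
      target[i] = this.bytes[(this.readPos + i) % this.bytes.length];
    }
    this.readPos += n;
    return n;
  }
}
```

When write() returns less than the chunk length the producer is ahead of playback and has to wait or drop data; when read() returns less than requested the consumer has caught up with the writer.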
Creating a virtual microphone device and using MediaStream eliminates the need to do that (count bytes). However, it is edifying to be able to achieve either approach.
Hi @guest271314, I'm now having fun with your MessagePort.postMessage() branch. From what I understand the time limit should be set by the Uint8Array size of the AudioWorkletProcessor. I still have inconsistency in the playback (stops occur after a few seconds, sometimes a minute) but I suspect my wav stream is too inconsistent (too big?). The appendBuffers log shows huge variation in the array length so there must be an issue here. Still digging! Benoit.
Think I found a hack (ugly?). Issues indeed occurred when index was lower than offset.
// magic "if" hack 😅
// only read when the read position (offset) is behind the write position (index)
if (this.offset < this.index) {
  for (let i = 0; i < 512; i++, this.offset++) {
    if (this.offset === this.uint8.length) {
      console.log(this.uint8);
      break;
    }
    uint8[i] = this.uint8[this.offset];
  }
  const uint16 = new Uint16Array(uint8.buffer);
  CODECS.get(this.codec)(uint16, channels);
}
The offset is the bytes read, the index is the bytes written.
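A variation on that guard, as a sketch only: instead of reading whenever any bytes are available, it waits until a whole 512-byte frame has been written and returns silence otherwise. The readFrame function and the shape of the state object (uint8, index, offset) are assumptions modeled on the names used in this thread, not the repository's implementation.

```javascript
const FRAME_BYTES = 512; // bytes consumed per process() call in the snippet above

// state.uint8: backing bytes, state.index: bytes written, state.offset: bytes read.
function readFrame(state) {
  const frame = new Uint8Array(FRAME_BYTES); // zero-filled, which decodes to silence
  if (state.index - state.offset < FRAME_BYTES) {
    // Underrun: the reader caught up with the writer, so leave offset alone
    // and output silence for this call instead of reading unwritten bytes.
    return { frame, underrun: true };
  }
  frame.set(state.uint8.subarray(state.offset, state.offset + FRAME_BYTES));
  state.offset += FRAME_BYTES;
  return { frame, underrun: false };
}
```

Because offset does not advance during an underrun, playback resumes from the same position once the network delivers more data, at the cost of a short gap.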
well it's actually not a file but a live stream from my mic.
One solution
navigator.mediaDevices.getUserMedia({audio: true})
.then(stream => {
// do stuff with stream: MediaStream
});
Hi! My "hack" works like a charm. Before that it actually stopped every time offset caught up with index (I suppose it makes sense). Now it never stops. MediaStream was not a solution because the final use is to stream from the input card of another device. For now everything works as expected! Thanks a lot for all your "WIPs"
You should be able to record the microphone output per the NPM documentation. To capture microphone directly see also guest271314/SpeechSynthesisRecorder#17 (comment), https://github.com/guest271314/setUserMediaAudioSource, https://github.com/guest271314/captureSystemAudio.
I actually removed this part // accumulate 344 512 1.5 of data (to achieve real time, maybe that's what's causing latency)
This is capable of streaming without waiting for accumulation of data https://github.com/guest271314/webtransport/blob/main/webTransportAudioWorkletWebAssemblyMemoryGrow.js.
Hi there! Do you have one minute to show me how to implement this (i.e. where that text / bytes input comes from)?
For the WebTransport version, "text" input originates in the browser at the function call
webTransportAudioWorkletMemoryGrow('hello world')
which sends it to the quic-transport URL
let data = encoder.encode(text);
await writer.write(data);
console.log('writer close', await writer.close());
input_data here https://github.com/guest271314/webtransport/blob/main/quic_transport_server_tts.py#L138 is the same text in the process
data = subprocess.run(['./tts.sh', input_data], stdout=subprocess.PIPE)
"$1" is the input text, passed in this case to espeak-ng https://github.com/guest271314/webtransport/blob/main/tts.sh#L2
espeak-ng -m --stdout "$1" # TODO pass, set options
The response is payload https://github.com/guest271314/webtransport/blob/main/quic_transport_server_tts.py#L140
self.connection.send_stream_data(response_id, payload, True)
The ReadableStream from the quic-transport server https://github.com/guest271314/webtransport/blob/main/webTransportAudioWorkletWebAssemblyMemoryGrow.js#L183
const { readable } = stream;
is what we pipeTo() https://github.com/guest271314/webtransport/blob/main/webTransportAudioWorkletWebAssemblyMemoryGrow.js#L184 a WritableStream(), which in this case will grow() a WebAssembly.Memory (SharedArrayBuffer) instance if necessary (see https://github.com/guest271314/webtransport/blob/main/webTransportAudioWorkletWebAssemblyMemoryGrow.js#L8 for minimum, maximum values corresponding to audio; feel free to verify the values used, as there was no manual for how to accurately derive those values, I learned Python and the bytes necessary per second by doing).
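The pipeTo() step described above can be sketched as follows. This is an illustration under assumptions, not the code at the linked lines: appendToMemory and createMemorySink are hypothetical names, chunks are assumed to be Uint8Array, and the memory passed in may be shared or not.

```javascript
const PAGE_BYTES = 65536; // size of one WebAssembly.Memory page

// Copies chunk into memory at offset, growing the memory by whole pages
// when the chunk would not fit. Returns the new write offset.
function appendToMemory(memory, chunk, offset) {
  const needed = offset + chunk.byteLength;
  const shortfall = needed - memory.buffer.byteLength;
  if (shortfall > 0) {
    // Throws a RangeError once the configured maximum is exceeded.
    memory.grow(Math.ceil(shortfall / PAGE_BYTES));
  }
  // Re-read memory.buffer on every call: growing a non-shared memory
  // detaches the ArrayBuffer that was returned before the grow.
  new Uint8Array(memory.buffer).set(chunk, offset);
  return needed;
}

// Wraps appendToMemory in a WritableStream so a ReadableStream can pipeTo() it.
function createMemorySink(memory) {
  let written = 0;
  const writable = new WritableStream({
    write(chunk) {
      written = appendToMemory(memory, chunk, written);
    },
  });
  return { writable, bytesWritten: () => written };
}
```

With a stream from the server this would be used along the lines of `const sink = createMemorySink(memory); await readable.pipeTo(sink.writable);`, with the AudioWorkletProcessor reading from the same memory.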
To install aioquic
python3 -m pip install aioquic
Create the necessary certificates https://github.com/guest271314/webtransport/blob/main/quic_transport_server_tts.py#L40 and launch Chrome or Chromium with the appropriate flags found in the same comment block.
Note, I commented out, and do not use, ALLOWED_ORIGINS https://github.com/guest271314/webtransport/blob/main/quic_transport_server_tts.py#L94 for the ability to run the code at console on any site.
awesome, thanks a lot.
Relevant to running the code at console at any origin, there is still a restriction on doing so using AudioWorklet due to the design being an Ecmascript Module, thus GitHub blocks loading. Once you get the code running and test at console on this very page you will perhaps gather why I filed https://github.com/WebAudio/web-audio-api-v2/issues/109.
Hi, first of all I would like to thank you for this POC / script. Exactly what I was looking for. Amazing work. But I ran into an issue when I started to stream a "live stream", actually a raw audio stream from a node server (using node portaudio). It works well for a few seconds (maybe a minute), then the sound is muted / gone. Do you think it's buffer related? A network issue? For information, I'm trying to broadcast audio without latency from node portaudio to the browser. I thought your script was a good way to start! Thanks for your time. Benoit