Open FredTheNoob opened 1 month ago
Hi Fred,
I was facing the same issue. Below is my solution:
import io

import librosa
import numpy as np
import soundfile
from fastapi import WebSocket

# `app`, `SAMPLING_RATE` and `recognize_service` are assumed to be defined elsewhere.


@app.websocket("/transcribe/streaming")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    buffer = b""
    while True:
        out = []
        raw_bytes = await websocket.receive_bytes()
        if not raw_bytes:
            break
        buffer += raw_bytes
        if buffer != b"":
            # Interpret the bytes as headerless 16-bit little-endian mono PCM.
            sf_buffer = soundfile.SoundFile(io.BytesIO(buffer), channels=1, endian="LITTLE", samplerate=SAMPLING_RATE,
                                            subtype="PCM_16", format="RAW")
            audio, _ = librosa.load(sf_buffer, sr=SAMPLING_RATE, dtype=np.float32)
            out.append(audio)
            # Reset the byte buffer once this chunk has been decoded.
            buffer = b""
        if out:
            audio_data = np.concatenate(out)
            audio_buffer = np.array([], dtype=np.float32)
            audio_buffer = np.append(audio_buffer, audio_data)
            try:
                segments, info = recognize_service.recognize(audio=audio_buffer, beam_size=5, language="en")
                result = {
                    "language": info.language,
                    "language_probability": info.language_probability,
                    "segments": [
                        {
                            "start": segment.start,
                            "end": segment.end,
                            "text": segment.text
                        } for segment in segments
                    ],
                }
                await websocket.send_json(data=result)
            except Exception as e:
                # Log the error and keep the connection open for the next chunk.
                print(e)
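The endpoint above treats every message as raw 16-bit little-endian mono PCM. To illustrate roughly what that decoding step amounts to, here is a minimal standard-library sketch. `decode_pcm16le` is a hypothetical helper, not part of soundfile or librosa, and it assumes the client really sends headerless PCM. Audio from the browser's MediaRecorder is typically a compressed container such as WebM/Opus and would need to be decoded first.

```python
import struct
from typing import List, Tuple


def decode_pcm16le(buffer: bytes) -> Tuple[List[float], bytes]:
    """Decode the complete 16-bit little-endian samples found in `buffer`.

    Returns the samples scaled to the range [-1.0, 1.0) together with any
    trailing byte that does not yet form a whole sample.
    """
    usable = len(buffer) - (len(buffer) % 2)  # only whole 2-byte samples
    count = usable // 2
    ints = struct.unpack(f"<{count}h", buffer[:usable])
    samples = [value / 32768.0 for value in ints]
    return samples, buffer[usable:]


# Three whole samples plus one dangling byte that belongs to the next chunk.
samples, rest = decode_pcm16le(b"\x00\x00\x00\x40\x00\x80\xff")
print(samples, rest)
```

One possible use is to carry the returned leftover bytes over as the new `buffer` instead of resetting it to `b""`, so that a sample split across two websocket messages is not lost. With NumPy, `np.frombuffer(data, dtype="<i2").astype(np.float32) / 32768.0` performs the same conversion on a buffer of even length.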
I have the following frontend code which sends audio data over a websocket in the browser (using the microphone):
It uses the MediaRecorder API to send an audio chunk every 2 seconds. This is received on the backend like this:
main.py:
ASR.py:
The issue happens when I try to clear the audio buffer. My thought is to clear the buffer every time I detect punctuation that marks the end of a sentence. However, clearing the buffer throws the following error: