introlab / odas_web

A desktop visualization GUI for the ODAS library
MIT License

Unable to record/transcribe #44

Open chejmadi opened 4 years ago

chejmadi commented 4 years ago

Hi, I'm running ODAS on a Raspberry Pi with Respeaker 4-mic array and running ODAS Web on a remote laptop running WSL on Windows 10. I'm somehow just not able to save any recordings once ODAS Web is running. I check the "Determine if separated audio is recorded" box. As soon as I start speaking, the microphone array picks up my voice and the terminal on my remote machine shows that "Recorder 0 started", but then shows "Recorder 0 was false active" - it then goes "Write stream 0 full", "Holding samples" and "Recorder 0 ended". The files that get saved are either of duration -1.00 seconds like in another issue here, or 1.54 seconds long, with barely any discernible sound. Also as soon as the recorder begins, the ODAS Web GUI really slows down, and I can't even kill the process on the Raspberry Pi easily.

Plus, the transcription function is not working either, although I'm not sure if that's due to this recorder problem or a separate issue of its own. I have an API key for Google's speech API (at least, I think so) but I'd like to know where one should download it from. I'm attaching my API key so you guys can confirm whether it's correct: SpeechTranscription-f7dd9e0aeebf.zip

Thanks!

GodCed commented 4 years ago

Hi, let's start with the recording issue.

It looks like the ODAS Web process cannot write the audio file fast enough. I have never used WSL, so I can't tell whether the issue comes from there. ODAS Web then hangs because the buffers grow out of hand (the slow GUI part) and it stops receiving audio samples, which leads to ODAS freezing (that's why the process is hard to kill on the Pi). What is your ODAS sink config? Maybe the sample rate is just too high and it can't write fast enough (parsing raw audio in a JavaScript loop is a rather suboptimal process). 16000 is normally a safe choice.

For the transcription part, I will refer you to the Google Cloud documentation for getting your key file. Your file looks like a proper key, though. However, if recording is not working, I'm not sure speech-to-text will, so I would try to sort out one issue at a time.

chejmadi commented 4 years ago

I don't know what you mean exactly by "sink config" but the config file I run for ODAS on the RPi has the sampling rate for "raw" as 16000. However the sampling frequency for "separated" and "postfiltered" are 44100.

GodCed commented 4 years ago

Yes those are the ones. They "sink" data out of ODAS. Try setting them at 16000 like your RAW (and adjust the ODAS Web configuration accordingly).
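To give a rough sense of why the sink rate matters, here is a back-of-the-envelope sketch of the raw byte rate a single sink produces at 44100 Hz versus 16000 Hz. The channel count (4) and sample width (16 bits) are assumptions for illustration, not values confirmed in this thread, and `sink_bytes_per_second` is a hypothetical helper, not part of ODAS or ODAS Web.

```python
def sink_bytes_per_second(sample_rate_hz: int,
                          channels: int = 4,          # assumed number of separated channels
                          bits_per_sample: int = 16   # assumed sample width
                          ) -> int:
    """Raw PCM byte rate for one sink: rate * channels * bytes per sample."""
    return sample_rate_hz * channels * bits_per_sample // 8


for rate in (44100, 16000):
    print(f"{rate} Hz -> {sink_bytes_per_second(rate)} bytes/s")
# 44100 Hz -> 352800 bytes/s
# 16000 Hz -> 128000 bytes/s
```

Under these assumptions, dropping from 44100 to 16000 cuts the data the receiving side has to parse and write by a factor of about 2.76 per sink, and the "separated" and "postfiltered" sinks each carry their own stream.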

chejmadi commented 4 years ago

Okay, that makes sense. I'll try!

chejmadi commented 4 years ago

Also, should they be the same as RAW even when I'm recording on the raspberry Pi itself (without ODAS Web, just saving it to a file on the Pi)?

GodCed commented 4 years ago

Unless your application really requires a specific sample rate, they should. There is no advantage to upsampling above the RAW sample rate that I know of, because you can't "create" new information that was not present in the source signal.

chejmadi commented 4 years ago

I see. So I changed the sampling frequency to 16000, but the problem persists. The audio files are recorded but are only 1.54 seconds long; however, I can now clearly hear my voice in them. So the length problem is still there. I'm going to try and upload a screenshot of my laptop terminal. I have no idea how "Recorder 23 was defined" turns up there. Then there's a bunch of (what I think are) JavaScript messages. image image

chejmadi commented 4 years ago

I've also noticed that it closes the connection by itself after showing that the Write Stream is full. I haven't touched anything on the Pi end. The connection got closed somehow, even though there was no error message on the Pi's terminal.

image

image

Once the connection closes, it starts outputting this on the terminal on my laptop image

At this point, two 1.54s recordings have been processed and can be played back. Two are still processing, by the looks of things. image

And at the moment it seems like they aren't going to be processed. There's a request timeout. image

GodCed commented 4 years ago

The connection closing on the Pi is expected, as ODAS simply stops without any message when it can't sink in real time.

There is a mix of request timeouts from Google Speech and "recording buffer full" messages in the ODAS Studio output. Can you disable the Google Speech Transcription for now? It will isolate things and make the output cleaner. Also, if you could get terminal output with proper line termination, it would certainly improve readability.

As for recorders numbered up to 34, that is really strange, as recorders are created in a loop from 0 to 3.

chejmadi commented 4 years ago

Yeah, the line termination drives me nuts too. Sorry. I didn't know if I could change that. Not sure if it's an Electron thing or an npm thing (it was a huge pain getting those two things set up on my laptop). Anyway, I disabled the Google Speech Transcription and hey presto, the recorder started working again! I guess there has to be something wrong with the API key or something, I'll go through the link you shared. Thanks!

GodCed commented 4 years ago

Glad to hear it’s working. From what I've seen, your key file seems good. The error

Error 14: UNAVAILABLE

leads me to think there is a network connectivity problem between the laptop and the Google API.

A quick search turned up this thread.

I don’t know how networking works through the Windows Subsystem for Linux, but the Google API seems to dislike proxies, so maybe (and this is a far-fetched maybe) the problem comes from there.
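One way to narrow down a connectivity problem like this is to check, from inside WSL, whether a plain TCP connection to the API host can be opened at all. The sketch below uses only the Python standard library; `can_reach` is a hypothetical helper, and a successful TCP handshake does not prove that gRPC calls will work or that no proxy is interfering, so treat it as a first sanity check only.

```python
import socket


def can_reach(host: str, port: int = 443, timeout_s: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, calling `can_reach("speech.googleapis.com")` in the same WSL shell that launches ODAS Web and getting `False` would point towards a DNS, firewall, or proxy issue rather than a problem with the key file.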