Using STT - Githubissues

mailong25 / self-supervised-speech-recognition

speech to text with self-supervised learning based on wav2vec 2.0 framework

379 stars, 114 forks

Using STT #24

Open NhacBatQuan opened 3 years ago

NhacBatQuan commented 3 years ago

Hi Mailong, thank you for helping us last time. After we removed the files shorter than 2 s, we were able to fine-tune the model to WER = 16. But when we use STT in Colab, we run into a problem: ipykernel_launcher.py: error: unrecognized arguments: -f / mnt/disk2/data. Can you help us modify the code to solve this problem? We really need your help.

mailong25 commented 3 years ago

If you can, try to download the model and run inference on your own local machine. If you can't, then try this in your Colab:

import sys
sys.argv.append('/test/data')
print(sys.argv)

It should print out something like this:

['/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py',
 '-f',
 '/root/.local/share/jupyter/runtime/kernel-dff03990-da23-4aef-8dd0-6fdf9d4cbee2.json',
 '/test/data']

In your case sys.argv has only 3 elements, which may be the problem. You can modify this line of code to fix the problem:

NhacBatQuan commented 3 years ago

It turns out the error was caused by argparse conflicting with the notebook, so we tried passing the -f argument and it works. But now we have a new problem: no module named 'examples.speech_recognition'. Is this because of a problem with the installation, or does it come from the import?
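For readers who hit the same argparse conflict, here is a minimal sketch of two common ways around it. It does not reproduce the repository's code: the parser and its --data option are hypothetical stand-ins, and the kernel file name is made up. A notebook kernel is started with -f followed by a connection file, so a parser that reads sys.argv sees arguments it does not know.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser standing in for the script's real one.
    parser = argparse.ArgumentParser()
    parser.add_argument("--data", help="path to the data directory")
    return parser


# Roughly what a notebook kernel leaves in sys.argv after the program name
# (the file name here is invented for illustration).
kernel_args = ["-f", "/root/.local/share/jupyter/runtime/kernel-example.json"]

parser = build_parser()

# Option 1: bypass sys.argv by passing the argument list explicitly.
args = parser.parse_args(["--data", "/test/data"])
print(args.data)

# Option 2: keep reading a full argument list, but collect unknown
# arguments instead of exiting with "unrecognized arguments".
known, unknown = parser.parse_known_args(kernel_args + ["--data", "/test/data"])
print(known.data, unknown)
```

Option 2 is only safe when the script's own arguments are named options. With a positional argument, the kernel's connection file path would be consumed as that positional, so passing the list explicitly (option 1) is the more predictable choice in a notebook.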

mailong25 commented 3 years ago

The module "examples.speech_recognition.w2l_decoder" is located inside the installed fairseq directory. Could you please change the import from examples.speech_recognition.w2l_decoder to the actual path of the installed fairseq?

sa-thangbn commented 3 years ago

@NhacBatQuan You can just copy the examples folder into /usr/local on Google Colab (e.g. !cp -r /content/self-supervised-speech-recognition/fairseq/examples /usr/local/lib/python3.7/dist-packages). It works!
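An alternative to copying the folder, offered only as a sketch, is to put the fairseq checkout on sys.path so that the examples package resolves from there. The helper name make_importable is hypothetical. The path is the one used in the copy command above and may differ on your machine.

```python
import importlib.util
import sys


def make_importable(root: str, module_name: str) -> bool:
    """Put `root` on sys.path and report whether `module_name` can then be found."""
    if root not in sys.path:
        sys.path.insert(0, root)
    # Make sure the import system notices directories created after startup.
    importlib.invalidate_caches()
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name cannot be found.
        return False


# Path taken from the copy command above; adjust it to your own checkout.
FAIRSEQ_ROOT = "/content/self-supervised-speech-recognition/fairseq"
print(make_importable(FAIRSEQ_ROOT, "examples.speech_recognition"))
```

If this prints True, importing examples.speech_recognition.w2l_decoder should at least get past the missing-package error. The module's own dependencies still have to be installed.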

CSLujunyu commented 3 years ago

I followed step 4.1 (Make prediction on multiple audios programmatically); however, I get this message: ImportError: cannot import name 'Transcriber' from 'stt'.

Could you provide the right version of stt?

mailong25 commented 3 years ago

You should run the import from inside the "self-supervised-speech-recognition" directory.
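One possible explanation, stated here as an assumption and not confirmed in the thread, is that Python finds a different module named stt before the repository's own one. A small sketch for checking which file the name resolves to (module_origin is a hypothetical helper):

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Shows where `stt` would be imported from in the current environment.
print(module_origin("stt"))
```

If the printed path does not point into the self-supervised-speech-recognition directory, change into that directory before the import (for example with os.chdir in Colab) and restart the kernel so that a previously imported stt is not reused.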