TensorSpeech / TensorFlowASR

:zap: TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2. Supported languages that can use characters or subwords
https://huylenguyen.com/asr
Apache License 2.0

Setup: Could not find a version that satisfies the requirement ctc-decoders #11

Closed stefan-falk closed 4 years ago

stefan-falk commented 4 years ago

Running pip install . after cloning the repository gives me:

ERROR: Could not find a version that satisfies the requirement ctc-decoders (from tiramisu-asr==0.0.1) (from versions: none)
ERROR: No matching distribution found for ctc-decoders (from tiramisu-asr==0.0.1)

How can I set this up correctly?

Environment

$ python --version
Python 3.7.0
$ uname -sv
Linux #100~16.04.1-Ubuntu SMP Wed Apr 22 23:56:30 UTC 2020

nglehuy commented 4 years ago

Sorry that I didn't write the README properly. The ctc-decoders repo is referenced in setup.sh, so you can install it manually :D I'm working on a NaN issue in the Warp RNNT Loss, so the Transducer models are not ready yet. :(

nglehuy commented 4 years ago

I updated the README (it now includes instructions for setting up the environment) and removed ctc-decoders from the requirements, so it can be installed manually if needed.

nglehuy commented 4 years ago

The RNNT models can be trained now, and the usage of the scripts that install external dependencies is described in the README. I'll close this here. Let me know if you have any other issues :D

stefan-falk commented 4 years ago

@usimarit Hello! Thanks for working on this! I'll give it a try as soon as I find the time. :)

stefan-falk commented 4 years ago

@usimarit Regarding the NaN issue: I had this issue too when I was using https://github.com/noahchalifour/rnnt-speech-recognition. I was able to fix it, though, after following https://github.com/noahchalifour/rnnt-speech-recognition/issues/31#issuecomment-637499806 - not sure if that will help.

nglehuy commented 4 years ago

I had it because I was passing the wrong input_lengths. The model reduces the time dimension, but I didn't reduce the input_lengths fed to rnnt_loss accordingly, so the time dimension of acts and the input_lengths didn't match. That produced NaN values (and also mismatched output likelihoods). I fixed it and it works now :laughing:
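
For readers who hit the same NaN, here is a minimal sketch of the length bookkeeping described above. It is not the code from this repository: the helper names `reduce_input_lengths` and `check_lengths` are hypothetical, and the ceiling-division formula is an assumption that only holds when the encoder's time reduction behaves like "same"-padded striding. Check your own model's padding and stride before reusing it.

```python
from typing import List


def reduce_input_lengths(input_lengths: List[int], time_reduction_factor: int) -> List[int]:
    """Map raw frame counts to frame counts after the encoder's time reduction.

    Assumption: the encoder shortens the time axis by ceiling division
    (as with "same"-padded strided layers). Other padding schemes need
    a different formula.
    """
    if time_reduction_factor < 1:
        raise ValueError("time_reduction_factor must be >= 1")
    # Integer ceiling division: -(-n // f) == ceil(n / f) for positive ints.
    return [-(-n // time_reduction_factor) for n in input_lengths]


def check_lengths(acts_time_dim: int, reduced_lengths: List[int]) -> None:
    """Fail early if any length exceeds the time dimension of the loss input."""
    for i, n in enumerate(reduced_lengths):
        if n > acts_time_dim:
            raise ValueError(
                f"utterance {i}: length {n} exceeds acts time dimension {acts_time_dim}"
            )


if __name__ == "__main__":
    raw_lengths = [100, 73, 4]  # frames per utterance before the encoder
    reduced = reduce_input_lengths(raw_lengths, time_reduction_factor=4)
    check_lengths(acts_time_dim=max(reduced), reduced_lengths=reduced)
    print(reduced)
```

Running a check like `check_lengths` right before the loss call turns a silent NaN into an explicit error when the unreduced lengths are passed by mistake.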

stefan-falk commented 4 years ago

Awesome! Thank you :)