noahchalifour / rnnt-speech-recognition

End-to-end speech recognition using RNN Transducers in Tensorflow 2.0
MIT License
241 stars 78 forks

Multi-thread Audio Conversion #24

Closed shaunm closed 4 years ago

shaunm commented 4 years ago

The original script was not very efficient. I modified it so the conversion loop can run concurrently on multiple threads. A second command-line argument lets the user specify the thread count; the default is single-threaded. This significantly reduces the time needed to prepare the dataset.
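A minimal sketch of the idea, not the actual script from this pull request. It assumes the inputs are `.mp3` files in a single directory, and it uses a hypothetical `convert_one` placeholder (copy to `.wav`, then delete the source) in place of the real conversion command. The names `convert_dir` and `convert_one` are made up for illustration. Strictly speaking, bash runs these as background processes rather than threads.

```shell
#!/usr/bin/env bash
# Sketch only: convert_one is a hypothetical stand-in for the real
# conversion step, which is not shown in this thread.
convert_one() {
  local src="$1"
  # Placeholder "conversion": copy to a .wav name, then remove the source.
  cp -- "$src" "${src%.mp3}.wav" && rm -- "$src"
}

convert_dir() {
  local dir="$1"
  local threads="${2:-1}"   # second argument; defaults to single-threaded
  local running=0
  local f
  for f in "$dir"/*.mp3; do
    [ -e "$f" ] || continue   # the glob matched nothing
    convert_one "$f" &        # start this file's conversion in the background
    running=$((running + 1))
    if [ "$running" -ge "$threads" ]; then
      wait                    # let the current batch finish before starting more
      running=0
    fi
  done
  wait                        # wait for any jobs still running
}
```

Something like `convert_dir ./clips 4` would then convert up to four files at a time, and `convert_dir ./clips` would behave like the original sequential loop. Each file is started exactly once by the single loop, so no two jobs ever pick up the same file.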

shaunm commented 4 years ago

Any particular reason you didn't use parallel or xargs --max-procs=$(nproc)?

Yes. The script uses a for loop over a static list of files, so running several copies of it in parallel could cause concurrency issues, where two instances attempt to convert the same file or a file that has already been converted or removed.

I could be wrong, since I am not 100% familiar with the inner workings of either command. There may be a way around this, but editing the script to thread the loop seemed simpler.

dyc3 commented 4 years ago

Both commands take inputs on stdin and distribute those inputs to child processes. I would opt for using xargs because it's included on most systems by default, and the way you've done it here is hard to decipher IMO.

I tried using parallel and had to move the conversion and deletion to a separate script file to get it to work.
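For illustration, the xargs approach could look roughly like this. It is a sketch under assumptions, not code from this repository: the `.mp3`/`.wav` extensions, the function name `convert_with_xargs`, and the copy-then-delete placeholder for the real conversion command are all hypothetical. Since xargs reads the file names from stdin and passes each one to exactly one child process, no two children receive the same file.

```shell
#!/usr/bin/env bash
# Sketch only: the inline sh -c script is a hypothetical stand-in for
# the real conversion and deletion steps.
convert_with_xargs() {
  local dir="$1"
  local jobs="${2:-1}"   # number of parallel child processes; defaults to 1
  # -print0 / -0 keep file names with spaces or newlines intact.
  # -n 1 passes one file per child; -P caps how many children run at once.
  find "$dir" -type f -name '*.mp3' -print0 |
    xargs -0 -n 1 -P "$jobs" sh -c '
      [ -n "$1" ] || exit 0   # nothing to do if no file name was passed
      src="$1"
      # Placeholder "conversion": copy to a .wav name, then remove the source.
      cp -- "$src" "${src%.mp3}.wav" && rm -- "$src"
    ' _
}
```

On a system with GNU coreutils, `convert_with_xargs ./clips "$(nproc)"` would use one child per CPU. Keeping the conversion and deletion in an inline `sh -c` script avoids the need for a separate script file.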

shaunm commented 4 years ago

I agree that my modifications are harder to decipher, but the average user may not be well versed enough in the *nix environment to run the script concurrently, so the operation would take longer than necessary. The solution I am proposing makes them aware that the operation can run faster and lets them do so with just one additional flag.

I understand if the maintainers are reluctant to allow this pull request because of perceived bloat, but I think it would be a helpful addition given the script is used solely for preparing the training data.

noahchalifour commented 4 years ago

@sman1 Thanks for the updated script. I know the original was not very efficient; it was just something I quickly wrote to get the job done.