galv / lingvo-copy

Apache License 2.0

Ag/featurize #10

Closed galv closed 3 years ago

galv commented 4 years ago

Let's just make this a PR for now, so it's easy to compare your changes.

galv commented 4 years ago

Please read commit 276d902 for explanation.

Also, please don't cannibalize create_asr_features.py. Doing that makes it impossible for us to rerun librispeech feature extraction in the future. In cases like these, where we can't really unit-test everything, it's better to just copy-and-modify, which is what I did in my commit (I copied your changes to a new file called create_peoples_speech_asr_features.py). Can you please revert the changes you made to create_asr_features.py before merging?

galv commented 4 years ago

BTW, I estimated 86.5 hours to featurize the training set on a single machine with a single job, but it wasn't running at full capacity. I suspect that if you booted up a 16-core machine overnight and ran ~20 separate shards, this would run to completion relatively quickly.
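
A back-of-the-envelope sketch of that estimate, for anyone who wants to plug in other numbers. It assumes equal-size shards, one core per job, and linear scaling, none of which has been measured here; `estimate_wall_clock_hours` is a hypothetical helper, not something in this repo.

```python
import math


def estimate_wall_clock_hours(total_hours, num_shards, num_workers):
    """Rough wall-clock estimate for a sharded job.

    Assumptions (not measured): shards are equal in size, each job
    uses one worker (core), and throughput scales linearly.
    """
    hours_per_shard = total_hours / num_shards
    # Shards run in "waves" when there are more shards than workers.
    waves = math.ceil(num_shards / num_workers)
    return hours_per_shard * waves


# 86.5 single-job hours, ~20 shards, 16 cores: 20 shards need two waves.
print(estimate_wall_clock_hours(86.5, 20, 16))
```

Under those assumptions, 20 shards on 16 cores take two waves of about 4.3 hours each, so the job would fit in an overnight run. Matching the shard count to the core count would avoid the second, mostly idle wave.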

anjaligopi commented 3 years ago

Closing this in favor of #11