Hello,
I built wav2letter with the CPU option and ran it on an AWS EC2 instance (c5.2xlarge: 8 vCPU, 16 GB RAM, Ubuntu 16.04 LTS). I followed the 1-librispeech_clean tutorial, but the training job has been running for ~18 hours and is still stuck in epoch 1.
Is this normal for CPU training, or did something go wrong? Thanks for your help!
@duytruong: the CPU backend is a lot slower than the CUDA backend, so training will be slow, especially on Librispeech. A few questions: did you set a reportiters value here? Was there evidence it was processing some samples?

@jacobkahn I added --reportiters=1 to my ./Train ... command. I couldn't tell whether it was processing or not, but the runtime in the logs was 00:00:00 or 00:00:01, hence I thought there was no processing happening.
@duytruong: this sounds like an issue reading in data if operations are completing immediately. Can you add some logging to the training pipeline to make sure data is being loaded properly? Check here, perhaps: Train.cpp#L490.
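For illustration, here is a minimal, self-contained sketch of that kind of check: time each batch as it comes out of the loader and print its size. Note that loadNextBatch and Batch are placeholders, not wav2letter APIs; the real loop in Train.cpp will differ.

```cpp
// Sketch: log how long each batch takes to load and how big it is, so a
// stalled or empty dataset shows up immediately in the training logs.
// loadNextBatch() and Batch are stand-ins, not wav2letter types.
#include <chrono>
#include <iostream>
#include <optional>
#include <vector>

using Batch = std::vector<float>; // placeholder for a real batch type

// Stand-in loader: returns batches until the (fake) dataset is exhausted.
std::optional<Batch> loadNextBatch(int& cursor, int datasetSize) {
  if (cursor >= datasetSize) {
    return std::nullopt;
  }
  ++cursor;
  return Batch(16000, 0.0f); // pretend each batch holds 16k samples
}

int main() {
  int cursor = 0;
  const int datasetSize = 5;
  int batchIdx = 0;

  while (true) {
    auto start = std::chrono::steady_clock::now();
    auto batch = loadNextBatch(cursor, datasetSize);
    auto elapsedMs = std::chrono::duration_cast<std::chrono::milliseconds>(
                         std::chrono::steady_clock::now() - start)
                         .count();

    if (!batch) {
      std::cout << "no more batches after " << batchIdx << " iterations\n";
      break;
    }
    // If these lines never appear, or sizes are 0, data loading is the problem.
    std::cout << "batch " << batchIdx++ << ": " << batch->size()
              << " values, loaded in " << elapsedMs << " ms\n";
  }
  return 0;
}
```

If the per-batch lines never appear, or the sizes come back as zero, the issue is in data loading rather than in the model or the CPU backend itself.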
@jacobkahn Thanks, I'll give it a try.
Closing due to inactivity — feel free to reopen if you're still having trouble.