Andong-Li-speech / RTNet

implementation of Monaural Speech Enhancement with Recursive Learning in the Time Domain

Performance improvements #4

Closed jonashaag closed 4 years ago

jonashaag commented 4 years ago

  • Create zeros() directly on GPU rather than moving them from CPU to GPU.
  • Allow for num_workers > 1 (move .cuda() out of loader)
  • Don't recompute batch_loss 3x
  • Use cudnn.benchmark
  • Use pin_memory

For me it's a training speed improvement of ~8x

Andong-Li-speech commented 4 years ago

Thanks for your constructive suggestions! There is indeed a lot of room for optimization in the project, and you have really improved it. I am occupied with other work at the moment, but I will carefully revise the code in the near future and make the project easier to use.
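
For readers who want to see what the five changes listed in the pull request look like in practice, here is a minimal PyTorch sketch. It is not the code from this PR: `make_loader`, `train_one_epoch`, the MSE loss, and the `state` buffer are hypothetical stand-ins chosen for illustration, and the model is assumed to map a batch of noisy features to an estimate of the same shape.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Fall back to CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Let cuDNN auto-tune its convolution algorithms; this helps most when
# input shapes stay the same from batch to batch.
torch.backends.cudnn.benchmark = True


def make_loader(dataset, batch_size=4, num_workers=2):
    # Hypothetical helper. The dataset yields CPU tensors only, so worker
    # processes never touch CUDA and num_workers > 1 is safe.
    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        pin_memory=torch.cuda.is_available(),  # page-locked host memory
    )


def train_one_epoch(model, loader, optimizer):
    # Hypothetical training loop; returns the mean loss over all batches.
    total = 0.0
    for noisy, clean in loader:
        # Move data to the device here, not inside the loader.
        noisy = noisy.to(device, non_blocking=True)
        clean = clean.to(device, non_blocking=True)

        # Allocate directly on the target device instead of
        # torch.zeros(...).cuda(), which allocates on CPU and then copies.
        state = torch.zeros(noisy.shape, device=device)

        estimate = model(noisy + state)
        # Compute the loss once and reuse it for backward and logging.
        batch_loss = torch.nn.functional.mse_loss(estimate, clean)

        optimizer.zero_grad()
        batch_loss.backward()
        optimizer.step()
        total += batch_loss.item()
    return total / len(loader)
```

On platforms that start worker processes with `spawn` (Windows, macOS), the code that builds the loader and iterates over it should sit under an `if __name__ == "__main__":` guard when `num_workers > 0`. How much each change contributes to the overall speedup will depend on the model and the hardware.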