lhotse-speech / lhotse

Tools for handling speech data in machine learning projects.
https://lhotse.readthedocs.io/en/latest/
Apache License 2.0
904 stars, 204 forks

More similar mean batch duration across nodes with DynamicBucketingSampler in multi-GPU training #1309

Closed lifeiteng closed 1 month ago

lifeiteng commented 3 months ago

Update #863

Fix #857

pzelasko commented 3 months ago

Does it actually help you achieve greater GPU utilization? Would be great if you can share any benchmarks. I wasn't able to verify the improvements in the original PR.

lifeiteng commented 3 months ago

> Does it actually help you achieve greater GPU utilization? Would be great if you can share any benchmarks. I wasn't able to verify the improvements in the original PR.

It will be around 2x faster on 8xGPU.

pzelasko commented 3 months ago

> It will be around 2x faster on 8xGPU.

Wow!! That's fantastic! Let me know if you need any help along the way.

pzelasko commented 1 month ago

I'm closing this PR in favor of #1341, which has broader support for lazy cutsets (webdataset, lhotse shar) and for both map-style and iterable-style datasets. Thank you for your contributions, @lifeiteng. I'd be curious to see what kind of speedup you manage to gain in your workflow vs. the expected 2x you mentioned earlier.