neurostatslab / vocalocator

Deep neural networks for sound source localization and vocalization attribution.
MIT License

Have separate IterableDataloader and Dataloader for variable-length and fixed-length data #25

Closed Aramist closed 1 year ago

Aramist commented 1 year ago

As the IterableDataloader cannot be parallelized across cores, it is much slower in practice than the Dataloader. Therefore, when fixed-length data are requested (via the crop length config parameter), it would be better to have a simpler subclass of Dataset.
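For illustration, a minimal sketch of what such a map-style `Dataset` might look like. The class name `FixedLengthCropDataset`, the `crop_length` argument, and the toy 4-channel tensors are assumptions made for this example, not the repository's actual code or data format.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class FixedLengthCropDataset(Dataset):
    """Hypothetical map-style dataset that returns fixed-length random crops."""

    def __init__(self, recordings, crop_length):
        # Drop recordings that are too short to crop from.
        self.recordings = [r for r in recordings if r.shape[0] >= crop_length]
        self.crop_length = crop_length

    def __len__(self):
        return len(self.recordings)

    def __getitem__(self, idx):
        audio = self.recordings[idx]
        max_start = audio.shape[0] - self.crop_length
        # Pick a random start so the crop always fits inside the recording.
        start = int(torch.randint(0, max_start + 1, (1,)))
        return audio[start : start + self.crop_length]


# Toy data: variable-length recordings with 4 channels each (assumed layout).
recordings = [torch.randn(n, 4) for n in (500, 800, 1200, 300)]
dataset = FixedLengthCropDataset(recordings, crop_length=256)

# Every sample has the same shape, so the default collate function can stack them.
loader = DataLoader(dataset, batch_size=2, shuffle=True)
batch = next(iter(loader))
```

Because every item has the same length, no custom batching logic is needed, and the `DataLoader` can split indices across worker processes on its own when `num_workers` is set.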

Deadline: 2023-06-16

Aramist commented 1 year ago

I've determined that the IterableDataset setup is not inherently much slower than the MapDataset, but it does require a bit of special code to prevent each worker process from iterating over the entire dataset. At present, every worker yields a full copy of the data, so the epochs are num_workers times larger than intended and run much slower than expected as a result.
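A minimal sketch of the kind of special code involved, using `torch.utils.data.get_worker_info()` so that each worker only yields its own slice. The names `ShardedIterableDataset` and `shard_indices` are hypothetical, and the strided split is just one possible sharding scheme, not necessarily what the repository ended up using.

```python
from torch.utils.data import DataLoader, IterableDataset, get_worker_info


def shard_indices(n_items, worker_id, num_workers):
    """Indices assigned to one worker: every num_workers-th item, offset by worker_id."""
    return range(worker_id, n_items, num_workers)


class ShardedIterableDataset(IterableDataset):
    """Hypothetical iterable dataset that splits its items across workers."""

    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):
        info = get_worker_info()
        if info is None:
            # Single-process loading: this process handles everything.
            worker_id, num_workers = 0, 1
        else:
            # Inside a worker process: take only this worker's shard.
            worker_id, num_workers = info.id, info.num_workers
        for i in shard_indices(len(self.items), worker_id, num_workers):
            yield self.items[i]


# batch_size=None disables automatic batching, so items come out one by one.
# num_workers=0 keeps this example single-process; with num_workers=k, each
# worker would iterate only over its own shard rather than the full dataset.
loader = DataLoader(ShardedIterableDataset(range(10)), batch_size=None, num_workers=0)
items = list(loader)
```

Without the `get_worker_info()` check, each worker process receives its own copy of the dataset object and iterates over all of it, which is what inflates the epoch by a factor of `num_workers`.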