Open PSacfc opened 2 months ago
We implemented a highly customized IterableDataset to handle the large dataset size, so num_workers behaves differently than it does with a standard PyTorch dataset. We did not find data loading to be the bottleneck, so we always set num_workers=1.
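For context, here is a minimal sketch of one well-known way num_workers differs for an IterableDataset in PyTorch. Each worker process gets its own copy of the dataset object, so a dataset that does not shard itself yields every sample once per worker. Whether this is the cause of the behavior in this repository is an assumption, not something confirmed in this thread. The sketch is plain Python and only simulates the workers. The helper names (`naive_worker_stream`, `sharded_worker_stream`, `collect`) are hypothetical. In a real IterableDataset, the worker id and worker count would come from `torch.utils.data.get_worker_info()` inside `__iter__`.

```python
from itertools import islice


def naive_worker_stream(samples, worker_id, num_workers):
    """No sharding: every worker iterates the full stream."""
    return iter(samples)


def sharded_worker_stream(samples, worker_id, num_workers):
    """Worker k takes samples k, k + num_workers, k + 2 * num_workers, ..."""
    return islice(samples, worker_id, None, num_workers)


def collect(stream_fn, samples, num_workers):
    """Simulate what the loader would see across all workers (hypothetical helper)."""
    out = []
    for worker_id in range(num_workers):
        out.extend(stream_fn(samples, worker_id, num_workers))
    return sorted(out)


samples = list(range(6))
naive = collect(naive_worker_stream, samples, num_workers=2)
sharded = collect(sharded_worker_stream, samples, num_workers=2)

print(naive)    # [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
print(sharded)  # [0, 1, 2, 3, 4, 5]
```

With two simulated workers, the unsharded stream returns every sample twice, while the sharded one returns each sample exactly once.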
Thanks for your excellent work. I have a question: why does performance decrease when num_workers > 1? What if only a single dataset is used?