About num_workers > 1 - Githubissues

microsoft / ClimaX

Foundation model for weather & climate

https://microsoft.github.io/ClimaX/

MIT License

610 stars, 83 forks

About num_workers > 1 #48

Open PSacfc opened 2 months ago

PSacfc commented 2 months ago

Thanks for your excellent work. I have a question: why does performance decrease when num_workers > 1? And what happens if only a single dataset is used?

tung-nd commented 2 months ago

We implemented a highly customized IterableDataset to deal with the large dataset size, so num_workers behaves differently than it does with a standard PyTorch map-style dataset. We did not find data loading to be the bottleneck, so we always set num_workers=1.
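For context on why num_workers interacts differently with an IterableDataset: in PyTorch, every DataLoader worker receives its own copy of the iterable dataset, so the dataset itself has to decide which part of the data each worker reads. The sketch below illustrates that general mechanism with `torch.utils.data.get_worker_info`. It is not the ClimaX implementation; the class name `ShardedFiles`, the helper `shards_for`, and the round-robin split are assumptions made for illustration only.

```python
from torch.utils.data import DataLoader, IterableDataset, get_worker_info


class ShardedFiles(IterableDataset):
    """Hypothetical iterable dataset that yields items from a list of shards."""

    def __init__(self, shards):
        self.shards = list(shards)

    def shards_for(self, worker_id, num_workers):
        # Round-robin split: worker i takes shards i, i + n, i + 2n, ...
        return self.shards[worker_id::num_workers]

    def __iter__(self):
        info = get_worker_info()  # None when loading in the main process
        if info is None:
            mine = self.shards
        else:
            mine = self.shards_for(info.id, info.num_workers)
        for shard in mine:
            yield shard


if __name__ == "__main__":
    ds = ShardedFiles(range(8))
    # batch_size=None disables automatic batching, so items pass through as-is.
    loader = DataLoader(ds, batch_size=None, num_workers=2)
    # With the per-worker split above, each shard should be seen exactly once.
    print(sorted(int(x) for x in loader))
```

Without the `get_worker_info` branch, each worker would iterate over the full shard list and every item would be yielded once per worker. A customized IterableDataset therefore has to handle this split itself, which is one reason raising num_workers does not behave like it does for a map-style dataset.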