syorami opened this issue 5 years ago
This is not expected. I didn't try using DataParallel with this code. Maybe it's due to the non-standard dataloader; I wonder if it would be better with the default one: https://github.com/lopuhin/kaggle-imet-2019/blob/f1ec0827149a8218430a6884acf49c27ba6fcb1f/imet/main.py#L22
Yes, it's because of ThreadingDataLoader. I guess ThreadingDataLoader starts a new ThreadPool at every iteration with no limit; even without nn.DataParallel I can still see the CPUs fully loaded. When I switch from ThreadingDataLoader to the original torch DataLoader, the problem is solved.
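The suspected behavior can be sketched with a stdlib-only stand-in (a simplified illustration, not the repo's actual ThreadingDataLoader; the class names and the doubling transform are invented for the demo). It contrasts a loader that builds a fresh ThreadPool on every iteration with one that reuses a single pool across epochs:

```python
from multiprocessing.pool import ThreadPool

class PerIterationPoolLoader:
    """Creates a new ThreadPool each time it is iterated (wasteful)."""
    def __init__(self, dataset, workers=4):
        self.dataset = dataset
        self.workers = workers
        self.pools_created = 0  # count pools spun up, one per epoch

    def __iter__(self):
        self.pools_created += 1
        with ThreadPool(self.workers) as pool:
            # ThreadPool workers share memory, so the lambda needs no pickling
            yield from pool.imap(lambda x: x * 2, self.dataset)

class SharedPoolLoader:
    """Creates one ThreadPool up front and reuses it across epochs."""
    def __init__(self, dataset, workers=4):
        self.dataset = dataset
        self.pool = ThreadPool(workers)
        self.pools_created = 1

    def __iter__(self):
        yield from self.pool.imap(lambda x: x * 2, self.dataset)

data = list(range(8))
bad = PerIterationPoolLoader(data)
good = SharedPoolLoader(data)
for _ in range(3):  # three "epochs"
    assert list(bad) == list(good) == [x * 2 for x in data]
print(bad.pools_created, good.pools_created)  # prints "3 1"
```

Each extra pool adds its own worker threads, so repeated creation with no cap on total threads would explain steadily rising CPU load.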
Let's keep the issue open in case other people run into this.
Fine. I just clicked it again by mistake.
When I add nn.DataParallel to the code so that I can use multiple GPUs with the default worker count, I find that all my CPUs are fully loaded. I haven't checked the code, but I guess it's because of the multiprocessing implementation. Please tell me whether this is abnormal or just expected training behavior.