lopuhin / kaggle-imet-2019


full cpu usage #2

Open · syorami opened this issue 5 years ago

syorami commented 5 years ago

When I add nn.DataParallel to the code so that I can use multiple GPUs with the default number of workers, I find that all my CPUs are fully loaded. I haven't checked the code in detail, but I guess it's due to the multiprocessing implementation. Please tell me whether this is abnormal or just expected during training.

[screenshot: all CPU cores at full load]
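For reference, a minimal sketch of the kind of change described above, using a hypothetical stand-in model (these are not the repo's actual lines; the real script builds its own model):

```python
import torch
from torch import nn

# Hypothetical stand-in model, just to illustrate the wrapping step.
model = nn.Linear(128, 10)

if torch.cuda.device_count() > 1:
    # Replicate the model on every visible GPU and split each batch among them.
    model = nn.DataParallel(model)

if torch.cuda.is_available():
    model = model.cuda()
```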

lopuhin commented 5 years ago

This is not expected. I didn't try using DataParallel with this code. Maybe it's due to the non-standard data loader; I wonder if it would behave better with the default one: https://github.com/lopuhin/kaggle-imet-2019/blob/f1ec0827149a8218430a6884acf49c27ba6fcb1f/imet/main.py#L22

syorami commented 5 years ago

Yes, it's because of ThreadingDataLoader. I guess the reason is that ThreadingDataLoader starts a new ThreadPool at every iteration with no limit, and even without nn.DataParallel I can still see the CPUs fully loaded. When I switch from ThreadingDataLoader to the original torch DataLoader, the problem is solved.
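For reference, a minimal sketch of that switch, using a dummy TensorDataset as a stand-in for the competition's image dataset (names and sizes here are illustrative, not the repo's actual code):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy data standing in for the real image dataset.
train_dataset = TensorDataset(torch.randn(64, 3, 64, 64),
                              torch.randint(0, 10, (64,)))

# The stock DataLoader uses a fixed pool of `num_workers` processes,
# so CPU usage stays bounded instead of a thread pool being recreated
# at every iteration.
train_loader = DataLoader(
    train_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=4,
    pin_memory=True,
)

for images, labels in train_loader:
    pass  # training step would go here
```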

lopuhin commented 5 years ago

Let's keep the issue open in case other people run into this.

syorami commented 5 years ago

Fine. I just clicked it again by mistake.