Multiprocessing - Githubissues

Jiaxuan777 commented 3 months ago

Hello, I'm glad to see your project, which integrates three main methods for WSI normalization. It's useful for what I am doing now, but I have a question. I put this WSI Normalizer in my dataloader, with the transform applied in __getitem__(). When I set num_workers > 1 to speed up loading, it causes an error. I spent two days on it, but the only fix I found is setting num_workers = 0, and my dataset is so large that this is too slow. I also tried the settings suggested in https://github.com/Peter554/StainTools/issues/43, but that did not help either. Can you help me? Thank you!

HaoyuCui commented 3 months ago

The code in this repository already uses all CPU cores when processing a single image, so there is no need for further parallelism on top of it. Considering the I/O overhead and the processing time per image, and the fact that the same data is loaded every epoch (so the normalization would be repeated every epoch, which is very inefficient), I don't recommend calling this method in the dataloader's __getitem__(). Instead, I recommend stain-normalizing your entire dataset once in a pre-processing phase, and then running the dataloader and its __getitem__() on the processed dataset rather than normalizing during training. Hope this helps.
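
A minimal sketch of that workflow, using only the Python standard library. The names `preprocess_dataset`, `PreprocessedDataset`, `normalize_file`, and `load_file` are hypothetical and are not part of WSI_Normalizer. `normalize_file` stands in for whatever call reads an image, stain-normalizes it, and writes the result. `load_file` stands in for your image reader. The `*.png` pattern is also an assumption.

```python
from pathlib import Path
from typing import Any, Callable, Iterable, List


def preprocess_dataset(
    src_dir: Path,
    dst_dir: Path,
    normalize_file: Callable[[Path, Path], None],
    pattern: str = "*.png",  # assumed file extension; adjust to your data
) -> List[Path]:
    """Normalize every matching image in src_dir once, writing into dst_dir.

    normalize_file is a hypothetical placeholder: it should read the image at
    its first argument, stain-normalize it, and save it to its second argument.
    """
    dst_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for src in sorted(src_dir.glob(pattern)):
        dst = dst_dir / src.name
        if not dst.exists():  # skip files already processed in an earlier run
            normalize_file(src, dst)
        outputs.append(dst)
    return outputs


class PreprocessedDataset:
    """Map-style dataset over files that are already normalized.

    No stain normalization happens here, so __getitem__ stays cheap.
    load_file is a hypothetical placeholder for an image reader.
    """

    def __init__(self, paths: Iterable[Path], load_file: Callable[[Path], Any]):
        self.paths = list(paths)
        self.load_file = load_file

    def __len__(self) -> int:
        return len(self.paths)

    def __getitem__(self, index: int) -> Any:
        return self.load_file(self.paths[index])
```

Run `preprocess_dataset` once before training, then build the dataset from the returned paths. In a PyTorch project the class would normally subclass `torch.utils.data.Dataset`. Since `__getitem__` then only reads files, setting num_workers > 0 no longer competes with the normalizer for CPU cores.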

Jiaxuan777 commented 3 months ago

Thank you. Now I know how to deal with it.

HaoyuCui / WSI_Normalizer

Multiprocessing #2