JunMa11 closed this issue 5 years ago
Mhm difficult. I don't have any experience with Windows. @justusschock maybe knows what to do?
Irrespective of this particular issue I think data augmentation should not be done offline unless absolutely necessary. You can get a lot more variability if you do it online. BraTS is tough due to the four modalities, but you can train a 3D UNet with around 5 CPU cores without problems when running data augmentation on the fly.
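To illustrate why online augmentation gives more variability, here is a minimal sketch in plain Python. It is not batchgenerators code: `augment`, `online_batches`, and the tiny list-based "volumes" are hypothetical stand-ins chosen only to show that a fresh random transform is drawn every time a case is requested, so no augmented copy is ever stored on disk.

```python
import random


def augment(volume, rng):
    """Hypothetical augmentation: random mirroring plus a small intensity shift."""
    out = list(volume)  # copy, so the stored case is never modified
    if rng.random() < 0.5:
        out.reverse()  # mirror along the only axis of this toy "volume"
    shift = rng.uniform(-0.1, 0.1)
    return [value + shift for value in out]


def online_batches(dataset, batch_size, num_batches, seed=0):
    """Yield batches whose samples are augmented at the moment they are drawn."""
    rng = random.Random(seed)
    for _ in range(num_batches):
        cases = rng.choices(dataset, k=batch_size)
        yield [augment(case, rng) for case in cases]


# Toy data standing in for image volumes (assumption, not BraTS data).
dataset = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
batches = list(online_batches(dataset, batch_size=2, num_batches=3))
```

With offline augmentation the network would see the same fixed set of transformed copies in every epoch; here each request produces a new variant.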
That's strange. I tested this on two separate Windows machines and it works on both of them. I sometimes get a broken pipe and/or a lock due to race conditions if I try to access the same hdf5 file at the same time. That said, I only got this issue with hdf5, meaning it should not be a general batchgenerators issue.
Can you maybe check if the same applies to you (and maybe create a gist containing a minimum working example to check on this)?
Are you maybe using additional custom multiprocessing within your loader/dataset?
EDIT: Why do you restart the generator directly after creating it?
Hi,
Why do you restart the generator directly after creating it?
That is on me. The code is something I wrote. See, if you initialize the MTA it will not start generating batches right away. It will do so only after you request the first batch OR if you restart it. It is just a habit, but I usually initialize the data augmentation pipeline, then initialize the network and so on. If I restart the MTA then it will already start generating batches while the main process is busy with other things. It is a little more efficient. But at the end of the day it doesn't matter because a training run can take days, so a few seconds at the start won't change much.
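The lazy-start behavior described above can be sketched with a toy prefetcher. This is not the MTA implementation from batchgenerators; `LazyPrefetcher` and its parameters are hypothetical, and it uses a single background thread purely to show the difference between starting on the first request and starting early via `restart()`.

```python
import queue
import threading

_END = object()  # sentinel marking the end of the batch stream


class LazyPrefetcher:
    """Toy stand-in: produces batches in the background, but only once started."""

    def __init__(self, make_batch, num_batches, queue_size=2):
        self.make_batch = make_batch
        self.num_batches = num_batches
        self.queue_size = queue_size
        self._queue = None
        self._thread = None
        self._done = False

    def _worker(self, out_queue):
        for index in range(self.num_batches):
            out_queue.put(self.make_batch(index))
        out_queue.put(_END)

    def restart(self):
        """Start producing from scratch (a real implementation would also stop old workers)."""
        self._done = False
        self._queue = queue.Queue(maxsize=self.queue_size)
        self._thread = threading.Thread(
            target=self._worker, args=(self._queue,), daemon=True
        )
        self._thread.start()

    @property
    def started(self):
        return self._thread is not None

    def __iter__(self):
        return self

    def __next__(self):
        if not self.started:
            self.restart()  # lazy start on the first requested batch
        if self._done:
            raise StopIteration
        item = self._queue.get()
        if item is _END:
            self._done = True
            raise StopIteration
        return item


lazy = LazyPrefetcher(lambda i: i * i, num_batches=3)
started_before_first_batch = lazy.started  # nothing is running yet
first = next(lazy)                          # the first request triggers the start

eager = LazyPrefetcher(lambda i: i * i, num_batches=3)
eager.restart()            # batches are prepared while the main thread does other work
results = list(eager)
```

Calling `restart()` right after construction simply moves the start of batch generation earlier, which is the small efficiency gain mentioned above.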
@JunMa11 how did you install your python environment? Conda?
@FabianIsensee maybe you might want to include a restart at the end of the initialization? I think this would match the expected behavior more than having to restart it manually.
I also think the only thing that matters is how you installed the package itself, as conda simply provides an encapsulated environment as long as you don't install packages via conda (and I can confirm it works well with conda).
I've had people with random issues caused by conda environments. That's why I was asking =)
@FabianIsensee Yes, I installed the Python environment with conda.
Hi, @justusschock thank you very much for your reply.
I only got this issue with hdf5 meaning it should not be a general batchgenerators issue. Can you maybe check if the same applies to you (and maybe create a gist containing a minimum working example to check on this)?
Sorry, I do not know what hdf5 means. I have provided an ErrorDemo to reproduce my error.
The environment is Win10, Python 3.6, Anaconda3-5.1.0-Windows-x86_64.
Are you maybe using additional custom multiprocessing within your loader/dataset?
I do not have experience with multiprocessing. Would it be possible for you to give me more insight?
Hi, @FabianIsensee thanks for your comment on offline data augmentation.
Irrespective of this particular issue I think data augmentation should not be done offline unless absolutely necessary. You can get a lot more variability if you do it online.
I agree with you that online augmentation can obtain more variability. My motivation for offline data augmentation is the following.
I looked at my whole tumor segmentation on BraTS 2018. Most of the cases get good results (Dice>0.88), but a few "hard" cases get a very low Dice (0.6-0.7). I want to do some offline data augmentation for the cases with a low Dice score and do online data augmentation during training, too. In this way, I hope the network can learn these "hard" cases better. Could you share your thoughts on this idea?
I'll test this on Monday. Unfortunately my local machine is running linux.
If you are not familiar with multiprocessing, you most likely don't have a custom one. I thought you might have additional multiprocessing inside your BraTS2017DataLoader3D which may have caused the problem, but this does not seem to be the case, so nevermind :)
What else do you have installed inside your environment?
Regarding the augmentation. Maybe it would be worth considering a weighted sampling together with online augmentation to present hard cases more frequently?
Hi @justusschock Thanks for your quick reply. These screenshots show the python packages in my environment.
Weighted sampling is a good idea that I missed. Thank you very much.
I agree with @justusschock . You should probably sample difficult cases more often rather than augmenting them offline.
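The weighted sampling suggested above could look roughly like the following sketch. The weighting rule (weight = 1 - Dice, with a small floor), the function names, and the case identifiers are all assumptions made for illustration; the Dice values are made up to match the ranges mentioned in this thread. Online augmentation would then be applied to whatever cases are drawn.

```python
import random


def sampling_weights(dice_by_case, floor=0.05):
    """Hypothetical rule: lower Dice means higher sampling weight, never below `floor`."""
    return {case: max(1.0 - dice, floor) for case, dice in dice_by_case.items()}


def sample_cases(dice_by_case, k, seed=None):
    """Draw k case identifiers with replacement, favoring the hard cases."""
    rng = random.Random(seed)
    weights = sampling_weights(dice_by_case)
    cases = list(weights)
    return rng.choices(cases, weights=[weights[c] for c in cases], k=k)


# Illustrative per-case validation Dice scores (not real results).
dice_by_case = {"case_a": 0.91, "case_b": 0.89, "case_c": 0.65}
drawn = sample_cases(dice_by_case, k=10000, seed=0)
counts = {case: drawn.count(case) for case in dice_by_case}
```

Compared with storing extra augmented copies of the hard cases, this only changes how often each case is presented, and the weights can be updated as the per-case scores change.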
So I just got the time to test this and I absolutely can't reproduce the error.
I tried the script you provided (which should be similar to the one by @FabianIsensee ). The only thing I noticed: I had to clone the repo again manually, since you mixed the setup code with the actual implementation (probably you just copied it there to get the imports working without an install). After I did a clean clone and a clean install everything worked like a charm (even with multiple epochs).
The steps I did are:
conda create -n batchgen_test python=3.6
conda activate batchgen_test
git clone https://github.com/MIC-DKFZ/batchgenerators (maybe this has to be executed in a git bash)
cd batchgenerators
pip install -e .
cd YOUR/PATH/HERE
python brats2017_dataloader_3D.py
and the output was like
python brats2017_dataloader_3D.py
(4, 128, 128, 128)
(4, 128, 128, 128)
(4, 128, 128, 128)
(4, 128, 128, 128)
(4, 128, 128, 128)
(4, 128, 128, 128)
Running 3 epochs took a total of 38.92 seconds with time per epoch being [25.76951551437378, 3.648458957672119, 9.5059654712677]
The time is not representative since I'm running some heavily CPU-consuming tasks in parallel.
Can you maybe try this and confirm if this works?
Hi, @justusschock. I really appreciate your time and valuable help. Following your guidance, batchgenerators works well now. Thank you very much.
Dear DKFZ,
Thanks for the great repo.
I want to use this tool for offline augmentation on Win10, and I followed the code in examples/brats2017. Everything works well except the multiprocessing. I have pasted the error information below. Would it be possible for you to tell me how to solve the problem? My goal is offline augmentation. I am not aiming for efficiency; I only want it to work.
The following error occurred:
I am looking forward to your reply.