z-fabian closed this issue 4 years ago
Yes, I tested `train_unet_demo.py` and `train_varnet_demo.py` on 2 GPUs with the fixed seed. The processes add the same slices when subsampling.
This also caused an issue with the `VolumeSampler` before: when different volumes were added on different processes, the number of slices per GPU was no longer consistent, which made DDP hang on multi-GPU training. This is resolved now too.
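To illustrate why inconsistent slice counts per GPU hang DDP: each rank must step through the same number of batches, or the ranks that finish early block forever in the gradient all-reduce. A minimal sketch of a volume-aware partition that equalizes slice counts by padding (this is a hypothetical helper for illustration, not the actual fastMRI `VolumeSampler`; `partition_volumes` and its arguments are invented names):

```python
import random


def partition_volumes(volume_to_slices, world_size, seed=42):
    """Assign whole volumes to ranks, then equalize slice counts.

    volume_to_slices: dict mapping volume name -> list of slice indices.
    Returns one list of slice indices per rank, all the same length,
    so no rank runs out of batches before the others (which would
    stall DDP's collective ops). Hypothetical sketch only.
    """
    rng = random.Random(seed)  # same seed on every rank -> same split
    volumes = sorted(volume_to_slices)
    rng.shuffle(volumes)
    # round-robin whole volumes across ranks
    shards = [volumes[r::world_size] for r in range(world_size)]
    counts = [sum(len(volume_to_slices[v]) for v in s) for s in shards]
    target = max(counts)
    per_rank = []
    for shard, count in zip(shards, counts):
        idx = [i for v in shard for i in volume_to_slices[v]]
        idx = idx + idx[: target - count]  # repeat a few slices to pad
        per_rank.append(idx)
    return per_rank
```

The key design point is that the shuffle uses an explicit `random.Random(seed)` instance rather than the global RNG, so every process computes the identical assignment independently.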
I am going to move the seed argument to `MriModule`.
The random seed of `SliceDataset` was not explicitly set at the experiment level. Different processes during DDP training had different random seeds, which led to different parts of the training data being selected by different workers if `args.sub_sample < 1`. Now the default behavior in the unet/varnet experiments is deterministic, and the same portion of the data is selected by all processes.
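The fix described above boils down to seeding the subsampling explicitly instead of relying on each process's implicit RNG state. A minimal sketch, assuming a hypothetical `subsample_slices` helper (the function name, `sub_sample` fraction argument, and default seed are illustrative, not the actual fastMRI API):

```python
import random


def subsample_slices(slice_ids, sub_sample, seed=42):
    """Deterministically keep a fraction of the slices.

    With the same explicit seed on every DDP process, all ranks
    select the identical subset; with no explicit seed, each rank's
    global RNG state differs and the subsets diverge.
    Hypothetical sketch of the described fix.
    """
    if sub_sample < 1.0:
        rng = random.Random(seed)  # explicit, experiment-level seed
        ids = list(slice_ids)
        rng.shuffle(ids)
        keep = round(len(ids) * sub_sample)
        return sorted(ids[:keep])
    return list(slice_ids)
```

Because the seed is an argument rather than ambient state, moving it up to the module level (as the comment proposes for `MriModule`) makes every process derive the same subset regardless of rank.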