15858805253yzl opened 1 year ago
This error means some data are corrupted; you could check whether the compressed NIfTI (.nii.gz) file can be loaded into a NumPy array.
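If it helps, here is a minimal sketch of that kind of check (the ./dataset glob pattern is just a placeholder, not from the original post): fully decompressing each .nii.gz forces gzip to validate the CRC stored in the file trailer, which is exactly the check that fails in the traceback below.

```python
import glob
import gzip

# Placeholder path pattern; point this at your own .nii.gz files.
for path in sorted(glob.glob("./dataset/**/*.nii.gz", recursive=True)):
    try:
        with gzip.open(path, "rb") as f:
            # Reading to EOF makes gzip verify the CRC32 stored in the trailer.
            while f.read(1 << 20):
                pass
    except (gzip.BadGzipFile, EOFError, OSError) as exc:
        print(f"CORRUPT: {path}: {exc}")
    else:
        print(f"OK: {path}")
```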
Hello, I encountered the same issue. I have checked the data and there is no problem; it can be loaded and processed. What's more, it works normally with nnUNet and other frameworks, and using the UNETR network under the MONAI framework also runs without issues. However, when I switch the network to SwinUNETR, the above error occurs. Could you please give me some advice?
This does appear to be an issue with the data; the exception (if yours is the same) originates deep inside the Nibabel library. If you are hitting an error from the same place in Nibabel, I can only suggest that you check your data again to be sure it isn't corrupted: find which image the error occurs on and try to load it manually with Nibabel.
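A small script along those lines, purely as a sketch (the glob pattern and dtype are placeholders), loads every image through Nibabel the same way MONAI's NibabelReader does and reports any file that fails:

```python
import glob

import nibabel as nib
import numpy as np

# Placeholder pattern; replace with the actual list of images used in training.
paths = sorted(glob.glob("./dataset/**/*.nii.gz", recursive=True))

bad = []
for path in paths:
    try:
        img = nib.load(path)
        # get_fdata() decompresses and scales the whole volume, mirroring what
        # MONAI's reader does inside LoadImaged.
        data = img.get_fdata(dtype=np.float32)
    except Exception as exc:
        bad.append(path)
        print(f"FAILED {path}: {exc}")
    else:
        print(f"OK {path} shape={data.shape}")

print(f"{len(bad)} file(s) failed to load")
```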
I have been having the same issue, but with a different model and a different private dataset. My dataset is .nii.gz. When I load each individual file with nibabel I have no issues.
When I load all files with MONAI during training, it successfully loads all images for 181 epochs, and then it crashes with the CRC error. I have retried this experiment, and it consistently crashes at epoch 182.
What is strange to me is that the first 181 times the loading of the NIfTI files goes without issues, but the 182nd time it crashes. To me this smells like a memory leak, but I do not observe a significant increase in memory consumption during those 182 epochs.
I am truly puzzled by this CRC error. Any ideas would be appreciated, though I understand that for you this is extremely hard to debug (and for me it is too; those 181 epochs take 1.5 days to complete).
This does sound like a fault somewhere else in your setup, that is in Python, Nibabel, or PyTorch (when we convert loaded arrays to tensors) rather than in MONAI itself, or an actual memory or other hardware fault. Is this data being loaded with the Dataset class or one of the persistent/caching dataset classes? If it's just Dataset, then none of our data optimisation mechanisms are involved. I can only suggest creating a test script which just goes through your data creating batches as normal but does no training/evaluation with it. This would be much faster, and if there's an issue with memory or something in MONAI it may help isolate it. If it consistently still produces the error, then eliminating other variables such as transforms one by one can narrow down the possible sources of error. I haven't seen this sort of error before, so I can only recommend this experimental approach.
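For what it's worth, a data-only loop of that kind could look roughly like the sketch below; the file list and the single LoadImaged transform are placeholders, and ideally you would reuse the exact transform chain and DataLoader settings from your training script.

```python
from monai.data import DataLoader, Dataset
from monai.transforms import Compose, LoadImaged

# Placeholder file list; build it exactly the way the training script does.
train_files = [{"image": "img_0001.nii.gz", "label": "lbl_0001.nii.gz"}]

# Ideally reuse the real training transforms; LoadImaged alone already exercises
# the Nibabel/gzip code path that raised the CRC error.
transforms = Compose([LoadImaged(keys=["image", "label"])])

ds = Dataset(data=train_files, transform=transforms)
loader = DataLoader(ds, batch_size=2, shuffle=True, num_workers=4)

# Loop for at least as many epochs as it took the failure to appear, but do no
# training, so a loading fault surfaces much faster than a full training run.
for epoch in range(200):
    for batch in loader:
        _ = batch["image"].shape  # touch the tensor to make sure it was materialised
    print(f"epoch {epoch}: all batches loaded without error")
```

If this loop reproduces the error on its own, the data pipeline is the culprit and the network and optimiser can be ruled out.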
Traceback (most recent call last):
  File "main.py", line 236, in <module>
    main()
  File "main.py", line 104, in main
    main_worker(gpu=0, args=args)
  File "main.py", line 211, in main_worker
    accuracy = run_training(
  File "/home/hf524/lz/SwinUNETR.../BRATS21/trainer.py", line 157, in run_training
    train_loss = train_epoch(
  File "/home/hf524/lz/SwinUNETR.../BRATS21/trainer.py", line 32, in train_epoch
    for idx, batch_data in enumerate(loader):
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
    data = self._next_data()
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1199, in _next_data
    return self._process_data(data)
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
    data.reraise()
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/torch/_utils.py", line 429, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 1.
Original Traceback (most recent call last):
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/monai/transforms/transform.py", line 91, in apply_transform
    return _apply_transform(transform, data, unpack_items)
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/monai/transforms/transform.py", line 55, in _apply_transform
    return transform(parameters)
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/monai/transforms/io/dictionary.py", line 154, in __call__
    data = self._loader(d[key], reader)
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/monai/transforms/io/array.py", line 266, in __call__
    img_array, meta_data = reader.get_data(img)
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/monai/data/image_reader.py", line 942, in get_data
    data = self._get_array_data(i)
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/monai/data/image_reader.py", line 1016, in _get_array_data
    _array = np.array(img.get_fdata(dtype=self.dtype))
  File "/home/hf524/.local/lib/python3.8/site-packages/nibabel/dataobj_images.py", line 355, in get_fdata
    data = np.asanyarray(self._dataobj, dtype=dtype)
  File "/home/hf524/.local/lib/python3.8/site-packages/nibabel/arrayproxy.py", line 391, in __array__
    arr = self._get_scaled(dtype=dtype, slicer=())
  File "/home/hf524/.local/lib/python3.8/site-packages/nibabel/arrayproxy.py", line 358, in _get_scaled
    scaled = apply_read_scaling(self._get_unscaled(slicer=slicer), scl_slope, scl_inter)
  File "/home/hf524/.local/lib/python3.8/site-packages/nibabel/arrayproxy.py", line 332, in _get_unscaled
    return array_from_file(self._shape,
  File "/home/hf524/.local/lib/python3.8/site-packages/nibabel/volumeutils.py", line 523, in array_from_file
    n_read = infile.readinto(data_bytes)
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/gzip.py", line 292, in read
    return self._buffer.read(size)
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/gzip.py", line 470, in read
    self._read_eof()
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/gzip.py", line 516, in _read_eof
    raise BadGzipFile("CRC check failed %s != %s" % (hex(crc32),
gzip.BadGzipFile: CRC check failed 0x8a39933f != 0x65ede34c
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/monai/transforms/transform.py", line 91, in apply_transform
    return _apply_transform(transform, data, unpack_items)
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/monai/transforms/transform.py", line 55, in _apply_transform
    return transform(parameters)
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/monai/transforms/compose.py", line 173, in __call__
    input_ = apply_transform(_transform, input_, self.map_items, self.unpack_items, self.log_stats)
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/monai/transforms/transform.py", line 118, in apply_transform
    raise RuntimeError(f"applying transform {transform}") from e
RuntimeError: applying transform <monai.transforms.io.dictionary.LoadImaged object at 0x7fc05368a0d0>
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/monai/data/dataset.py", line 105, in __getitem__
    return self._transform(index)
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/monai/data/dataset.py", line 91, in _transform
    return apply_transform(self.transform, data_i) if self.transform is not None else data_i
  File "/home/hf524/anaconda3/envs/yzl/lib/python3.8/site-packages/monai/transforms/transform.py", line 118, in apply_transform
    raise RuntimeError(f"applying transform {transform}") from e
RuntimeError: applying transform <monai.transforms.compose.Compose object at 0x7fc05368a490>