Closed GowthamE7 closed 1 year ago
Hi @GowthamE7, could you please refer to this discussion and try to use the command shared there? Thanks!
I am getting an error in folds:
Hi @GowthamE7, did you use the data json under this folder which contains "fold"?
Hi @KumoLiu, I tried using that file but ended up with this error: So I tried the base JSON file that comes with the downloaded dataset:
Hi @GowthamE7, from the screenshot you shared I can't see the cause of the problem. You should at least use the JSON file in the tutorial repo. Please also share the whole error message that you get when you use that JSON file, so that I can take a deeper look into it.
@KumoLiu, I tried the JSON file inside the tutorials but it throws an error. JSON snapshot from tutorials: But if I use the JSON that is downloaded along with the dataset, I can run the code without an error. JSON snapshot from dataset:
Hi @GowthamE7, I noticed that the file paths here have a different format, which may be the reason.
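One way to check whether the file-path format is the problem is to resolve every path in the datalist against the data root and list the ones that do not exist. The sketch below is only an illustration: it assumes the datalist is a dict with a "training" list whose entries carry "image" and "label" paths relative to the data root (plus a "fold" number), and the helper name missing_datalist_files is hypothetical, not part of MONAI.

```python
import os
from typing import Dict, List


def missing_datalist_files(datalist: Dict, dataroot: str, section: str = "training") -> List[str]:
    """Return the resolved image/label paths that do not exist on disk.

    Assumed layout (not confirmed by this thread):
    {"training": [{"image": "...", "label": "...", "fold": 0}, ...]}
    """
    missing = []
    for entry in datalist.get(section, []):
        for key in ("image", "label"):
            rel_path = entry.get(key)
            if rel_path is None:
                continue
            # os.path.join keeps rel_path unchanged when it is already absolute.
            full_path = os.path.normpath(os.path.join(dataroot, rel_path))
            if not os.path.isfile(full_path):
                missing.append(full_path)
    return missing
```

Loading the JSON with json.load and passing the result together with the data root gives a list of unresolved paths; an empty list means every referenced file was found.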
Hi @KumoLiu, thanks for the answer, I can now run the tutorial. I tried the same procedure on my custom dataset but I am not able to run the training. DataLoader Error:
Hi @GowthamE7, could you please share the whole error message? Thanks!
@KumoLiu , This is the error message
image_only=False has been deprecated since version 1.1. It will be changed to image_only=True in version 1.3.
warn_deprecated(argname, msg, warning_category)
[info] number of GPUs: 1
[info] world_size: 1
train_files_w: 103
train_files_a: 104
val_files: 53
/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py:554: UserWarning: This DataLoader will create 6 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
2023-02-20 15:27:51.872644 - Length of input patch is recommended to be a multiple of 32.
num_epochs 1000
num_epochs_warmup 500
num_epochs_per_validation 20
[info] amp enabled
2023-02-20 15:27:55.253999: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-20 15:27:56.182970: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2023-02-20 15:27:56.183134: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2023-02-20 15:27:56.183157: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
epoch 1/1000
learning rate is set to 0.025
[2023-02-20 15:28:02] 1/52, train_loss: 1.0072
Process Process-4:
Process Process-6:
Process Process-2:
Process Process-5:
Process Process-3:
Exception ignored in: <function _MultiProcessingDataLoaderIter.__del__ at 0x7f2bb0a24d30>
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 1466, in __del__
self._shutdown_workers()
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 1430, in _shutdown_workers
w.join(timeout=_utils.MP_STATUS_CHECK_INTERVAL)
File "/usr/lib/python3.8/multiprocessing/process.py", line 149, in join
res = self._popen.wait(timeout)
File "/usr/lib/python3.8/multiprocessing/popen_fork.py", line 44, in wait
if not wait([self.sentinel], timeout):
File "/usr/lib/python3.8/multiprocessing/connection.py", line 931, in wait
ready = selector.select(timeout)
File "/usr/lib/python3.8/selectors.py", line 415, in select
fd_event_list = self._selector.poll(timeout)
KeyboardInterrupt:
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/content/ref_api_work_dir/dints_0/scripts/search.py", line 653, in
Hi @GowthamE7, I see KeyboardInterrupt in your error message, so you may have interrupted the process unintentionally. Thanks!
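The log above also contains a DataLoader warning: 6 worker processes were requested while the suggested maximum on that system is 2. Whether this is related to the interruption is not established in this thread. The sketch below only shows one way to cap a requested worker count at the number of CPUs the system reports; the helper name capped_num_workers is hypothetical.

```python
import os
from typing import Optional


def capped_num_workers(requested: int, available: Optional[int] = None) -> int:
    """Return a worker count that does not exceed the number of available CPUs."""
    if requested < 0:
        raise ValueError("requested must be non-negative")
    if available is None:
        # os.cpu_count() can return None when the count cannot be determined.
        available = os.cpu_count() or 1
    return min(requested, max(available, 1))


print(capped_num_workers(6, available=2))  # prints 2
```

The returned value could then be passed wherever the training script sets the number of DataLoader workers.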
@KumoLiu, I didn't press ctrl+c. The code is getting stopped by itself. I checked it by running the code again and not touching my laptop and I am getting the same error.
/usr/local/lib/python3.8/dist-packages/monai/utils/deprecate_utils.py:321: FutureWarning: monai.transforms.io.dictionary LoadImaged.__init__:image_only: Current default value of argument image_only=False has been deprecated since version 1.1. It will be changed to image_only=True in version 1.3.
warn_deprecated(argname, msg, warning_category)
[info] number of GPUs: 1
[info] world_size: 1
train_files_w: 3
train_files_a: 4
val_files: 7
^C
Hi @GowthamE7, in your latest run the log does not show the first iteration, even though it uses a smaller amount of data, whereas your previous run did show it. Can you confirm that the error messages are the same?
Hi @GowthamE7
/usr/local/lib/python3.8/dist-packages/monai/utils/deprecate_utils.py:321: FutureWarning: monai.transforms.io.dictionary LoadImaged.__init__:image_only: Current default value of argument image_only=False has been deprecated since version 1.1. It will be changed to image_only=True in version 1.3. warn_deprecated(argname, msg, warning_category)
This message is just a warning. You can silence it by passing image_only=True to LoadImage, but it is not the issue and won't stop your process.
[info] number of GPUs: 1 [info] world_size: 1 train_files_w: 3 train_files_a: 4 val_files: 7
This is just logging information.
^C
And this, I think, is the 'ctrl+c'.
Hi @dongyang0122, I tried running it many times, but the code stops every time because of "^C", even though I didn't press ctrl+c. Thanks!
Hi @KumoLiu, Thanks for the explanation.
RuntimeError Traceback (most recent call last) Input In [109], in <cell line: 7>() 5 datastats_file = os.path.join(work_dir, "data_stats.yaml") 6 analyser = DataAnalyzer(datalist_file, dataroot, output_path=datastats_file) ----> 7 datastat = analyser.get_all_case_stats() 9 print("datalist file: ", os.path.abspath(datalist_file)) 10 print("dataroot path: ", os.path.abspath(dataroot))
File ~/conda/lib/python3.8/site-packages/monai/apps/auto3dseg/data_analyzer.py:235, in DataAnalyzer.get_all_case_stats(self, key, transform_list) 232 if not has_tqdm: 233 warnings.warn("tqdm is not installed. not displaying the caching progress.") --> 235 for batch_data in tqdm(dataloader) if has_tqdm else dataloader: 237 batch_data = batch_data[0] 238 batch_data[self.image_key] = batch_data[self.image_key].to(self.device)
File ~/conda/lib/python3.8/site-packages/tqdm/std.py:1195, in tqdm.__iter__(self) 1192 time = self._time 1194 try: -> 1195 for obj in iterable: 1196 yield obj 1197 # Update and possibly print the progressbar. 1198 # Note: does not call self.update(1) for speed optimisation.
File ~/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py:652, in _BaseDataLoaderIter.__next__(self) 649 if self._sampler_iter is None: 650 # TODO(https://github.com/pytorch/pytorch/issues/76750) 651 self._reset() # type: ignore[call-arg] --> 652 data = self._next_data() 653 self._num_yielded += 1 654 if self._dataset_kind == _DatasetKind.Iterable and \ 655 self._IterableDataset_len_called is not None and \ 656 self._num_yielded > self._IterableDataset_len_called:
File ~/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py:1347, in _MultiProcessingDataLoaderIter._next_data(self) 1345 else: 1346 del self._task_info[idx] -> 1347 return self._process_data(data)
File ~/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py:1373, in _MultiProcessingDataLoaderIter._process_data(self, data) 1371 self._try_put_index() 1372 if isinstance(data, ExceptionWrapper): -> 1373 data.reraise() 1374 return data
File ~/conda/lib/python3.8/site-packages/torch/_utils.py:461, in ExceptionWrapper.reraise(self) 457 except TypeError: 458 # If the exception takes multiple arguments, don't try to 459 # instantiate since we don't know how to 460 raise RuntimeError(msg) from None --> 461 raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0. Original Traceback (most recent call last): File "/home/jovyan/conda/lib/python3.8/site-packages/monai/transforms/transform.py", line 102, in apply_transform File "/home/jovyan/conda/lib/python3.8/site-packages/monai/transforms/transform.py", line 66, in _apply_transform File "/home/jovyan/conda/lib/python3.8/site-packages/monai/transforms/io/dictionary.py", line 154, in call File "/home/jovyan/conda/lib/python3.8/site-packages/monai/transforms/io/array.py", line 266, in call RuntimeError: LoadImage cannot find a suitable reader for file: /home/jovyan/Task04_Hippocampus/imagesTr/hippocampus_367.nii.gz. Please install the reader libraries, see also the installation instructions: https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies. The current registered: [<monai.data.image_reader.NumpyReader object at 0x7f3424368fa0>, <monai.data.image_reader.PILReader object at 0x7f34243689d0>].
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "/home/jovyan/conda/lib/python3.8/site-packages/monai/transforms/transform.py", line 102, in apply_transform File "/home/jovyan/conda/lib/python3.8/site-packages/monai/transforms/transform.py", line 66, in _apply_transform File "/home/jovyan/conda/lib/python3.8/site-packages/monai/transforms/compose.py", line 174, in call File "/home/jovyan/conda/lib/python3.8/site-packages/monai/transforms/transform.py", line 129, in apply_transform RuntimeError: applying transform <monai.transforms.io.dictionary.LoadImaged object at 0x7f34243686a0>
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/jovyan/conda/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/home/jovyan/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/jovyan/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in
Hi @GowthamE7, from the error message I see that LoadImage cannot find a suitable reader for the file. This may be an environment issue.
RuntimeError: LoadImage cannot find a suitable reader for file: /home/jovyan/Task04_Hippocampus/imagesTr/hippocampus_367.nii.gz.
Please install the reader libraries, see also the installation instructions:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies.
The current registered: [<monai.data.image_reader.NumpyReader object at 0x7f3424368fa0>, <monai.data.image_reader.PILReader object at 0x7f34243689d0>].
Could you please check your environment and install all the requirements? You can find some help here. Hope it helps, thanks!
Hi @KumoLiu, I have installed all the dependencies but I am still getting the error. Thanks!
Hi @GowthamE7, could you please try pip show nibabel in the terminal and show me the output?
Hi @KumoLiu,
Hi @KumoLiu,
Oh, I forgot that you train in Jupyter. Could you please try the same command in Jupyter? If it gives the same result, then it's weird. If not, the problem may be due to a different environment.
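A terminal and a Jupyter kernel can run different Python interpreters, so a package installed for one may be invisible to the other. As a rough sketch (the helper name describe_package is hypothetical), the following cell reports which interpreter the kernel is using and whether nibabel can be found from it:

```python
import importlib.util
import sys
from typing import Dict, Optional


def describe_package(name: str) -> Dict[str, Optional[str]]:
    """Report the running interpreter and where a top-level package would be imported from."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "package": name,
        # location is None when the package cannot be found by this interpreter.
        "location": spec.origin if spec is not None else None,
    }


print(describe_package("nibabel"))
```

Running the same snippet from the terminal's Python and comparing the "interpreter" and "location" values shows whether the two environments differ.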
@KumoLiu,
Hi @GowthamE7, then it may be due to the wrong path, could you please try this command in the cell?
from monai.transforms import LoadImage
data_path = "./imagesTs/hippocampus_267.nii.gz" # any data path in the json file you used
test = LoadImage(image_only=True)(data_path)
I tried the Auto3Dseg example and I couldn't start the training in my Colab with GPU enabled.