I trained a model using the nnU-Net framework on an Azure VM. It worked great. I then tested it on new images and the nnU-Net predictor worked well too.
Now I have a BIG batch of images to score, so I'm creating a pipeline job for it. Preprocessing the DICOM images in the pipeline job works, but when the first NIfTI file is read, the process stops right after scoring finishes, just as it is about to write out the resulting prediction.
Has anyone had any issue like that? Thank you in advance.
This is the error:
Traceback (most recent call last):
File "/azureml-envs/azureml_c4bcd504b3ff56cfbcfd2bba26738797/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/azureml-envs/azureml_c4bcd504b3ff56cfbcfd2bba26738797/lib/python3.11/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/azureml-envs/azureml_c4bcd504b3ff56cfbcfd2bba26738797/lib/python3.11/site-packages/nnunetv2/inference/data_iterators.py", line 58, in preprocess_fromfiles_save_to_queue
raise e
File "/azureml-envs/azureml_c4bcd504b3ff56cfbcfd2bba26738797/lib/python3.11/site-packages/nnunetv2/inference/data_iterators.py", line 50, in preprocess_fromfiles_save_to_queue
target_queue.put(item, timeout=0.01)
File "", line 2, in put
File "/azureml-envs/azureml_c4bcd504b3ff56cfbcfd2bba26738797/lib/python3.11/multiprocessing/managers.py", line 821, in _callmethod
conn.send((self._id, methodname, args, kwds))
File "/azureml-envs/azureml_c4bcd504b3ff56cfbcfd2bba26738797/lib/python3.11/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/azureml-envs/azureml_c4bcd504b3ff56cfbcfd2bba26738797/lib/python3.11/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
File "/azureml-envs/azureml_c4bcd504b3ff56cfbcfd2bba26738797/lib/python3.11/site-packages/torch/multiprocessing/reductions.py", line 568, in reduce_storage
fd, size = storage._share_fd_cpu_()
^^^^^^^^^^^^^^^^^^^^^^^^
File "/azureml-envs/azureml_c4bcd504b3ff56cfbcfd2bba26738797/lib/python3.11/site-packages/torch/storage.py", line 294, in wrapper
return fn(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/azureml-envs/azureml_c4bcd504b3ff56cfbcfd2bba26738797/lib/python3.11/site-packages/torch/storage.py", line 364, in _share_fdcpu
return super()._share_fdcpu(args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: unable to write to file : No space left on device (28)
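For reference, the scoring step of the pipeline job calls the predictor roughly like this. This is only a minimal sketch assuming the standard nnUNetPredictor API from nnunetv2; the model folder, input/output paths, fold selection, and worker counts are placeholders, not my actual values:

import torch
from nnunetv2.inference.predict_from_raw_data import nnUNetPredictor

# Build the predictor (placeholder settings, assuming the stock nnUNetPredictor API).
predictor = nnUNetPredictor(
    tile_step_size=0.5,
    use_gaussian=True,
    use_mirroring=True,
    device=torch.device("cuda", 0),
    verbose=False,
)

# Load the trained model (placeholder path and fold).
predictor.initialize_from_trained_model_folder(
    "/mnt/models/Dataset001_Example/nnUNetTrainer__nnUNetPlans__3d_fullres",
    use_folds=(0,),
    checkpoint_name="checkpoint_final.pth",
)

# predict_from_files spawns preprocessing worker processes that hand tensors back
# to the main process through a multiprocessing queue; the traceback above comes
# out of one of those workers (preprocess_fromfiles_save_to_queue).
predictor.predict_from_files(
    "/mnt/data/nifti_in",       # placeholder: folder with the converted NIfTI files
    "/mnt/data/predictions",    # placeholder: output folder for the predictions
    save_probabilities=False,
    overwrite=True,
    num_processes_preprocessing=2,
    num_processes_segmentation_export=2,
)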