BiomedSciAI / fuse-med-ml

A python framework accelerating ML based discovery in the medical field by encouraging code reuse. Batteries included :)
Apache License 2.0

Multiprocessing error when running KNIGHT baseline #170

Closed afoncubierta closed 2 years ago

afoncubierta commented 2 years ago

Describe the bug: Fatal error when loading the KNIGHT dataset on Red Hat Linux. The error does not occur on macOS.

FuseMedML version: commit 6a90bf3af9ca3724ae2882b717bc134ac3a930e3

Python version: Python 3.8.13

To reproduce: python fuse-med-ml/examples/fuse_examples/imaging/classification/knight/baseline/fuse_baseline.py

Expected behavior: The dataset is loaded and cached.

Trace:

multiprocess pool created with 8 workers.
  0%|          | 0/240 [00:09<?, ?it/s]
Traceback (most recent call last):
  File "[...]/fuse/lib/python3.8/multiprocessing/pool.py", line 851, in next
    item = self._items.popleft()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "baseline/fuse_baseline.py", line 221, in <module>
    main(config_path)
  File "baseline/fuse_baseline.py", line 116, in main
    train_ds, valid_ds = KNIGHT.dataset(
  File "fuse-med-ml/fuseimg/datasets/knight.py", line 307, in dataset
    train_dataset.create()
  File "fuse-med-ml/fuse/data/datasets/dataset_default.py", line 103, in create
    self._output_sample_ids_info = self._cacher.cache_samples(self._orig_sample_ids)
  File "fuse-med-ml/fuse/data/datasets/caching/samples_cacher.py", line 183, in cache_samples
    all_ans = run_multiprocessed(
  File "fuse-med-ml/fuse/utils/multiprocessing/run_multiprocessed.py", line 140, in run_multiprocessed
    ans = [x for x in iter]
  File "fuse-med-ml/fuse/utils/multiprocessing/run_multiprocessed.py", line 140, in <listcomp>
    ans = [x for x in iter]
  File "fuse-med-ml/fuse/utils/multiprocessing/run_multiprocessed.py", line 223, in _run_multiprocessed_as_iterator_impl
    for curr_ans in tqdm_func(
  File "[...]/fuse/lib/python3.8/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "[...]/fuse/lib/python3.8/multiprocessing/pool.py", line 856, in next
    self._cond.wait(timeout)
  File "[...]/fuse/lib/python3.8/threading.py", line 302, in wait
    waiter.acquire()
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

KeyboardInterrupt
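
A note on reading the trace above: the `IndexError: pop from an empty deque` is how the pool's result iterator in CPython notices that no result is ready yet. It then waits on a condition variable, and the `KeyboardInterrupt` arrived during that wait, with 0/240 samples done. That is consistent with the workers never returning a result, though the trace alone does not show why. The sketch below is not FuseMedML code. It only illustrates, under stated assumptions, how such a silent wait can be turned into an explicit error with the documented `timeout` argument of the iterator returned by `Pool.imap`. The helper `collect_with_timeout`, the `square` worker, and the timeout value are hypothetical names chosen for illustration.

```python
import multiprocessing as mp
from multiprocessing.pool import ThreadPool  # thread-based pool with the same API


def square(x):
    # Hypothetical stand-in for a per-sample caching function.
    return x * x


def collect_with_timeout(pool, func, items, timeout=5.0):
    """Collect imap results in order.

    Raises multiprocessing.TimeoutError if any single result takes longer
    than `timeout` seconds, instead of waiting indefinitely.
    """
    items = list(items)
    results_iter = pool.imap(func, items)  # chunksize defaults to 1
    return [results_iter.next(timeout=timeout) for _ in items]


if __name__ == "__main__":
    # Process-based workers, as in the failing run.
    with mp.Pool(processes=2) as pool:
        print(collect_with_timeout(pool, square, range(5)))

    # Thread-based workers: if this succeeds where mp.Pool stalls, the
    # problem is likely tied to process start-up or inter-process transfer.
    with ThreadPool(2) as pool:
        print(collect_with_timeout(pool, square, range(5)))
```

Passing the pool in as an argument keeps the helper independent of the worker type, so the same call can be tried with processes and with threads when narrowing down a platform-specific stall.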

mosheraboh commented 2 years ago

@afoncubierta, is this still an issue? If not, can we close it?

afoncubierta commented 2 years ago

Yes, it can be closed.