nanoporetech / medaka

Sequence correction provided by ONT Research
https://nanoporetech.com

OSError: [Errno 39] Directory not empty: 'variables'. Medaka fails at medaka consensus #443

Closed: imdanique closed this issue 8 months ago

imdanique commented 1 year ago

Describe the bug
Hello! It appears that medaka fails just before reaching the final step, medaka consensus. However, when I re-execute the command, it resumes and completes successfully. Notably, this issue occurs consistently with the latest version of medaka, which I do not recall happening previously. I am also curious whether a resumed run still generates the consensus properly.

Logging
Here is the end of my log:

[11:25:42 - PWorker] 79.4% Done (0.9/1.1 Mbases) in 2675.3s
[11:25:43 - PWorker] Processed 293 batches
[11:25:43 - PWorker] All done, 0 remainder regions.
[11:25:44 - Predict] Finished processing all regions.
Traceback (most recent call last):
  File "/home/adminrig/miniconda3/envs/assembly/bin/medaka", line 11, in <module>
  File "/home/adminrig/miniconda3/envs/assembly/lib/python3.8/site-packages/medaka/medaka.py", line 724, in main
  File "/home/adminrig/miniconda3/envs/assembly/lib/python3.8/site-packages/medaka/prediction.py", line 205, in predict
  File "/home/adminrig/miniconda3/envs/assembly/lib/python3.8/site-packages/medaka/datastore.py", line 162, in __exit__
  File "/home/adminrig/miniconda3/envs/assembly/lib/python3.8/site-packages/medaka/datastore.py", line 152, in cleanup
  File "/home/adminrig/miniconda3/envs/assembly/lib/python3.8/contextlib.py", line 533, in close
    self.__exit__(None, None, None)
  File "/home/adminrig/miniconda3/envs/assembly/lib/python3.8/contextlib.py", line 525, in __exit__
    raise exc_details[1]
  File "/home/adminrig/miniconda3/envs/assembly/lib/python3.8/contextlib.py", line 510, in __exit__
    if cb(*exc_details):
  File "/home/adminrig/miniconda3/envs/assembly/lib/python3.8/tempfile.py", line 827, in __exit__
    self.cleanup()
  File "/home/adminrig/miniconda3/envs/assembly/lib/python3.8/tempfile.py", line 831, in cleanup
    self._rmtree(self.name)
  File "/home/adminrig/miniconda3/envs/assembly/lib/python3.8/tempfile.py", line 813, in _rmtree
    _shutil.rmtree(name, onerror=onerror)
  File "/home/adminrig/miniconda3/envs/assembly/lib/python3.8/shutil.py", line 718, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/home/adminrig/miniconda3/envs/assembly/lib/python3.8/shutil.py", line 655, in _rmtree_safe_fd
    _rmtree_safe_fd(dirfd, fullname, onerror)
  File "/home/adminrig/miniconda3/envs/assembly/lib/python3.8/shutil.py", line 659, in _rmtree_safe_fd
    onerror(os.rmdir, fullname, sys.exc_info())
  File "/home/adminrig/miniconda3/envs/assembly/lib/python3.8/shutil.py", line 657, in _rmtree_safe_fd
    os.rmdir(entry.name, dir_fd=topfd)
OSError: [Errno 39] Directory not empty: 'variables'
Failed to run medaka consensus.

Environment:
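
For context, errno 39 on Linux is ENOTEMPTY: shutil.rmtree first deletes a directory's entries and then calls os.rmdir on the directory itself, which fails if new entries have appeared in the meantime. On NFS-backed filesystems this can happen because unlinking a file that another process still holds open leaves behind a hidden .nfsXXXX placeholder. A minimal sketch of the failing operation, with a hypothetical stale file standing in for those NFS artefacts:

import errno
import os
import tempfile

# Recreate a layout like the model store's temporary directory.
tmp = tempfile.mkdtemp()
variables = os.path.join(tmp, "variables")
os.mkdir(variables)

# Simulate an entry reappearing mid-cleanup (e.g. an NFS ".nfsXXXX" file).
open(os.path.join(variables, ".nfs0000stale"), "w").close()

try:
    os.rmdir(variables)  # the call shutil.rmtree ends with for each directory
except OSError as exc:
    print(exc.errno == errno.ENOTEMPTY)  # True: this is "[Errno 39]"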

homeveg commented 11 months ago

I have the very same problem:

[01:07:55 - PWorker] Processed 11881 batches
[01:07:55 - PWorker] All done, 0 remainder regions.
[01:07:56 - Predict] Finished processing all regions.
Traceback (most recent call last):
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/bin/medaka", line 11, in <module>
    sys.exit(main())
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/site-packages/medaka/medaka.py", line 724, in main
    args.func(args)
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/site-packages/medaka/prediction.py", line 125, in predict
    with medaka.models.open_model(args.model) as model_store:
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/site-packages/medaka/datastore.py", line 162, in __exit__
    self.cleanup()
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/site-packages/medaka/datastore.py", line 152, in cleanup
    self._exitstack.close()
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/contextlib.py", line 584, in close
    self.__exit__(None, None, None)
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/contextlib.py", line 576, in __exit__
    raise exc_details[1]
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/contextlib.py", line 561, in __exit__
    if cb(*exc_details):
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/tempfile.py", line 869, in __exit__
    self.cleanup()
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/tempfile.py", line 873, in cleanup
    self._rmtree(self.name, ignore_errors=self._ignore_cleanup_errors)
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/tempfile.py", line 855, in _rmtree
    _shutil.rmtree(name, onerror=onerror)
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/shutil.py", line 724, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/shutil.py", line 657, in _rmtree_safe_fd
    _rmtree_safe_fd(dirfd, fullname, onerror)
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/shutil.py", line 663, in _rmtree_safe_fd
    onerror(os.rmdir, fullname, sys.exc_info())
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/shutil.py", line 661, in _rmtree_safe_fd
    os.rmdir(entry.name, dir_fd=topfd)
OSError: [Errno 39] Directory not empty: 'variables'
Failed to run medaka consensus.

Environment:

homeveg commented 11 months ago

When I restarted medaka with the very same parameters, at first everything ran, picking up approximately from the point where it had failed, but it then failed with this error:

[11:09:22 - DataIndx] Loaded 1/1 (100.00%) sample files.
[11:09:22 - DataIndx] Loaded 1/1 (100.00%) sample files.
[11:09:22 - DataIndx] Loaded 1/1 (100.00%) sample files.
[E::fai_retrieve] Failed to retrieve block: unexpected end of file
Traceback (most recent call last):
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/bin/medaka", line 11, in <module>
    sys.exit(main())
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/site-packages/medaka/medaka.py", line 724, in main
    args.func(args)
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/site-packages/medaka/stitch.py", line 229, in stitch
    contigs, gt = fill_gaps(contigs, args.draft, args.fill_char)
  File "/fungen/funhome/Software/miniconda3/envs/medaka_env/lib/python3.10/site-packages/medaka/stitch.py", line 128, in fill_gaps
    draft_seq = draft.fetch(ref_name)
  File "pysam/libcfaidx.pyx", line 315, in pysam.libcfaidx.FastaFile.fetch
BlockingIOError: [Errno 11] b'Resource temporarily unavailable'
Failed to stitch consensus chunks.
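
The stitch failure is a different problem: pysam's FastaFile.fetch reports an unexpected end of file, which points at the draft FASTA or its .fai index being truncated or stale, and errno 11 (EAGAIN) suggests contention on the shared filesystem. If the draft file itself is intact, one thing to try is regenerating the index before re-running; a sketch, with a placeholder file name:

import os
import pysam

draft = "draft_assembly.fasta"  # placeholder for the draft passed to medaka

# Drop the possibly stale index and let pysam/htslib rebuild it.
if os.path.exists(draft + ".fai"):
    os.remove(draft + ".fai")
pysam.faidx(draft)  # equivalent to `samtools faidx draft_assembly.fasta`
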
alexweisberg commented 10 months ago

I am getting the same error as everyone else ("OSError: [Errno 39] Directory not empty: 'variables'"). Running the conda version on CentOS Linux.

cjw85 commented 10 months ago

Duplicate of #429.

yanhui09 commented 10 months ago

I had the same issue on HPC, but medaka works fine on my own server. Is there any workaround for HPC use?
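
If the shared filesystem is the culprit, one possible mitigation (untested here) is to point TMPDIR at node-local storage before launching medaka, so its temporary model directory never lives on NFS. Python's tempfile module honours TMPDIR, assuming medaka creates its temporary directory in the default location; a sketch with a placeholder scratch path:

import os
import tempfile

os.environ["TMPDIR"] = "/scratch/local"  # hypothetical node-local path
tempfile.tempdir = None                  # drop the cached default so TMPDIR is re-read
print(tempfile.gettempdir())             # now /scratch/local, if that directory exists

From a shell, exporting TMPDIR before the medaka_consensus invocation achieves the same thing.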

cjw85 commented 10 months ago

I'm not entirely sure why this occurs so frequently for some users. We routinely run medaka on HPC systems with shared file systems and do not see similar errors.

What is clear is that there are circumstances under which the code is unable to remove temporary files that it writes to the system temporary file location. I have added a workaround to catch and ignore such errors.
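
A sketch of that kind of guard, assuming a subclassed tempfile.TemporaryDirectory rather than the exact change that landed in medaka:

import tempfile
import warnings

class TolerantTemporaryDirectory(tempfile.TemporaryDirectory):
    """A temporary directory whose cleanup never raises (hypothetical helper).

    On shared filesystems, entries can reappear while shutil.rmtree runs,
    so a failed cleanup is reported as a warning instead of aborting the run.
    """

    def cleanup(self):
        try:
            super().cleanup()
        except OSError as exc:
            warnings.warn(f"failed to remove temporary files: {exc}")

On Python 3.10 and newer, tempfile.TemporaryDirectory(ignore_cleanup_errors=True) provides much the same behaviour out of the box.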

cjw85 commented 8 months ago

The workaround to this issue was applied in v1.10.0.