mdshw5 / pyfaidx

Efficient pythonic random access to fasta subsequences
https://pypi.python.org/pypi/pyfaidx

Device or resource busy #167

Closed nick-youngblut closed 4 years ago

nick-youngblut commented 4 years ago

I'm using pyfaidx on many bacterial genomes at the same time via multiprocessing.Pool, with all line-wrapped fasta files located in a temporary directory. When I try to remove that temporary directory at the end of the job, I get the error:

OSError: [Errno 16] Device or resource busy: '.nfs00000010882ebebe00055f32'

My script previously used pyfasta (I just updated to pyfaidx), so it's due to using pyfaidx. This seems to be an issue with pyfaidx & NFS, but any ideas on how to fix the issue?

OS: Ubuntu 18.04.4; Python: 3.6.10; pyfaidx: 0.5.9.1

mdshw5 commented 4 years ago

I can think of two potential issues that might cause this:

  1. Have you tried adding a time.sleep() before you try to remove the directory? It could be that you're calling Fasta.__exit__() before NFS has a chance to respond. pyfasta opens and closes the fasta file during each sequence retrieval, whereas pyfaidx keeps the file open for multiple operations and only closes it when a context manager calls the __exit__ method or the Fasta object goes out of scope.
  2. You might not be using a context manager (a "with" statement) in your code, since pyfasta did not support it. In that case you are never closing the open FASTA file in your script, which would cause the problem you describe. You can either use a with statement or add a line where you explicitly call Fasta.close(), which closes the open file; you should then be able to remove the file.
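
A minimal sketch of both suggestions, assuming the FASTA file is read through a with statement and the temporary directory is removed afterwards. The helper names (read_region, remove_dir_with_retry) and the retry count and delay are hypothetical, chosen only for illustration; whether a short wait is enough on a given NFS mount is not established in this thread.

```python
import errno
import shutil
import time


def read_region(fasta_path, contig, start, end):
    """Return one subsequence, closing the FASTA file before returning."""
    # Imported lazily so the cleanup helper below works without pyfaidx.
    from pyfaidx import Fasta

    # Leaving the with block calls Fasta.__exit__(), which closes the file.
    with Fasta(fasta_path) as fasta:
        return str(fasta[contig][start:end])


def remove_dir_with_retry(path, attempts=5, delay=1.0):
    """Remove a directory tree, retrying while the OS reports EBUSY."""
    for attempt in range(attempts):
        try:
            shutil.rmtree(path)
            return
        except OSError as exc:
            # Re-raise anything other than "Device or resource busy",
            # and give up after the last attempt.
            if exc.errno != errno.EBUSY or attempt == attempts - 1:
                raise
            # Hypothetical pause to let NFS release the .nfs* placeholder.
            time.sleep(delay)
```

If a with statement does not fit the structure of the script, the same effect should be reachable by creating the Fasta object and calling its close() method in a try/finally block before the directory is removed.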

mdshw5 commented 4 years ago

One other thing: I'm certain that pyfaidx is not process-safe (https://github.com/mdshw5/pyfaidx/issues/92#issuecomment-578200960). You might want to use a multiprocessing.Lock or switch to threading instead, since it sounds like your use case is really I/O-bound and you're not trying to do concurrent computation.
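
The threading variant might look roughly like the sketch below. It assumes one already-open Fasta object shared by all threads, with a threading.Lock serializing every read; the function name fetch_regions and the (name, start, end) tuple layout are hypothetical. Because the lock serializes reads, this only keeps the shared handle safe and does not by itself make the reads run in parallel.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def fetch_regions(fasta, regions, max_workers=4):
    """Fetch (name, start, end) regions from one shared FASTA-like object.

    `fasta` is anything that supports fasta[name][start:end], such as an
    open pyfaidx.Fasta instance.
    """
    lock = threading.Lock()

    def fetch(region):
        name, start, end = region
        # Only one thread touches the shared file handle at a time.
        with lock:
            return str(fasta[name][start:end])

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of `regions` in its results.
        return list(pool.map(fetch, regions))
```

With pyfaidx this would be called inside a with Fasta(...) block, so the file is closed once all threads have finished and before any temporary directory is removed.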

nick-youngblut commented 4 years ago

Yeah, I found #92 after posting this one. I tried it and the other solution posted on the Stack Overflow post that you seem to have gotten your solution from. Neither worked for my situation. It turns out that I really didn't need the features offered by pyfaidx, so I just switched to pyfastx.

mdshw5 commented 4 years ago

Oh neat - I didn’t know about pyfastx. Glad you found something that works for you.
