Closed nick-youngblut closed 4 years ago
I can think of two potential issues that might cause this:
1. Have you tried adding a time.sleep() before you try removing the directory? It could be that you're calling Fasta.__exit__() before the NFS has a chance to respond. pyfasta opens and closes the FASTA file during each sequence retrieval, whereas pyfaidx keeps the file open for multiple operations and only closes it when the context handler calls the __exit__ method or the Fasta object goes out of scope.
2. pyfasta did not support a context handler, so if your script still follows the pyfasta pattern you will not be closing the open FASTA file, which would cause the problems you describe. You can either use a with statement or add a line where you explicitly call Fasta.close(), which will close the open file, and then you should be able to remove the file.
One other thing: I'm certain that pyfaidx is not process-safe (https://github.com/mdshw5/pyfaidx/issues/92#issuecomment-578200960). You might want to implement a multiprocessing.Lock or change to using threading instead, since it sounds like your use case is really IO constrained and you're not trying to do concurrent computation.
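To make the suggestion above concrete, here is a minimal sketch of the "close every handle, then remove the directory" pattern combined with threads instead of processes. It uses only the standard library: the built-in open() stands in for a pyfaidx Fasta object, which according to the comment above can be used in a with statement or closed with Fasta.close(). The names first_sequence_length and process_and_cleanup are hypothetical and exist only for illustration.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def first_sequence_length(fasta_path):
    """Return the length of the first record in a FASTA file.

    The with statement guarantees the handle is closed before this
    function returns, so no open file is left behind in the directory.
    (Hypothetical helper; open() stands in for a pyfaidx Fasta object.)
    """
    length = 0
    seen_header = False
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                if seen_header:
                    break  # reached the second record, stop reading
                seen_header = True
            elif seen_header:
                length += len(line.strip())
    return length


def process_and_cleanup(records):
    """Write records to a temp dir, read them back in threads, remove the dir.

    `records` maps a sequence name to its sequence string.
    """
    tmp_dir = Path(tempfile.mkdtemp())
    try:
        paths = []
        for name, seq in records.items():
            path = tmp_dir / f"{name}.fna"
            path.write_text(f">{name}\n{seq}\n")
            paths.append(path)
        # Threads rather than processes: the work here is I/O bound.
        with ThreadPoolExecutor(max_workers=4) as pool:
            lengths = list(pool.map(first_sequence_length, paths))
    finally:
        # Every worker has finished and every handle is closed by now.
        shutil.rmtree(tmp_dir)
    return dict(zip(records, lengths)), tmp_dir
```

Because the pool's with block only exits after all workers have returned, and each worker closes its own handle, the directory removal runs when nothing in the script still holds a file open. Whether that alone is enough on NFS is not something this sketch can confirm.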
Yeah, I found #92 after posting this one. I tried it and the other solution posted on the Stack Overflow post that you seemed to have gotten your solution from. Neither worked for my situation. It turns out that I really didn't need the features offered by pyfaidx, so I just switched to pyfastx.
Oh neat - I didn’t know about pyfastx. Glad you found something that works for you.
I'm using pyfaidx on many bacterial genomes at the same time via multiprocessing.Pool, with all line-wrapped fasta files located in a temporary directory. When I try to remove that temporary directory at the end of the job, I get the error:
My script previously used pyfasta (I just updated to pyfaidx), so it's due to using pyfaidx. This seems to be an issue with pyfaidx & NFS, but any ideas on how to fix the issue?
OS: Ubuntu 18.04.4
python: 3.6.10
pyfaidx: 0.5.9.1