MatthewHiggins2017 / bioconda-PrimedRPA

GNU General Public License v3.0

FileNotFoundError: [Errno 2] No such file or directory: #2

Open JamesGerardMann opened 4 years ago

JamesGerardMann commented 4 years ago

Hello Everyone,

I'm opening a separate ticket for a FileNotFoundError. It was described in issue #1 but is not what that ticket is about. Here is some information about my setup: I am using PrimedRPA to align 2000 sequences with an average length of 12000 bases. These sequences are then background-checked against 5000 sequences with an average length of 12000 bases.

Here is the output that Jeff provided; I cleared my console, so I do not have my own copy.

Traceback (most recent call last):
  File "/home/jeff/miniconda3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/jeff/miniconda3/lib/python3.7/multiprocessing/pool.py", line 47, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/home/jeff/miniconda3/bin/PrimedRPA", line 541, in IndentifyingAndFilteringOligos
    MaxBackgroundScoreBindingScore, MaxScoreBackSeq, HardFailBool = BlastnBackgroundCheck(NucleotideSeq, AllParameter)
  File "/home/jeff/miniconda3/bin/PrimedRPA", line 195, in BlastnBackgroundCheck
    fastadict = FastaToDict('Adv_{}_{}.fa'.format(seq,CleanRefID))
  File "/home/jeff/miniconda3/bin/PrimedRPA", line 32, in FastaToDict
    with open(InputFile) as file_one:
FileNotFoundError: [Errno 2] No such file or directory: 'Adv_GTTGAATATTTACTTTAGATCATAAGCGGGTTGG_AB597287.1.fa'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/jeff/miniconda3/bin/PrimedRPA", line 1034, in <module>
    CheckingAlignedOutputFile(AllParameter)
  File "/home/jeff/miniconda3/bin/PrimedRPA", line 810, in CheckingAlignedOutputFile
    PotentialPrimerProbeOut = pool.starmap(IndentifyingAndFilteringOligos,PrimerProbeCheckParallelInput)
  File "/home/jeff/miniconda3/lib/python3.7/multiprocessing/pool.py", line 276, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "/home/jeff/miniconda3/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
FileNotFoundError: [Errno 2] No such file or directory: 'Adv_GTTGAATATTTACTTTAGATCATAAGCGGGTTGG_AB597287.1.fa'

I'm going to dig through the py file and see if I can figure out why this is occurring. I have a hunch that it's looking for .fa but the script is saving the files as blastn_input.fa

MatthewHiggins2017 commented 4 years ago

Hi James,

It appears the samtools derived file is not being generated. Could you please check that the version of samtools bioconda has installed is >=1.9?

Kind regards, Matt.

JamesGerardMann commented 4 years ago

Hello Matt,

Thank you for your swift reply. I’ve gone ahead and verified that samtools is installed via command line.

Console

$ samtools
Program: samtools
Version: 1.9 (using htslib 1.9)

It could be that I have installed it incorrectly. I just ran $ python pip list and samtools does not show up. I will try installing it again through $ python rather than $ wget / make.

PS: I appreciate this script and the work you have put into it. Have you thought about adding a crRNA discovery function for use with SHERLOCK? (Reference paper: https://science.sciencemag.org/content/360/6387/444) Briefly, a function that looks at the sequence between the primers and verifies that it contains a conserved 20-24 nt site, starting with TTTA, TTTC, or TTTG for CRISPR Cas12, or with any nt for Cas13?

Sincerely, James G. Mann
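A minimal sketch of the kind of check suggested in the PS, for the Cas12 case only. This is not part of PrimedRPA: the function names, the 4-base PAM followed by a spacer of up to 24 nt, and the "present in every input sequence" definition of conserved are all assumptions made for illustration.

```python
import re

# Assumption: a candidate site starts with TTTA, TTTC or TTTG.
# The lookahead lets overlapping candidates be reported as well.
PAM_PATTERN = re.compile(r"(?=(TTT[ACG]))")


def find_cas12_sites(amplicon, min_len=20, max_len=24):
    """Return (pam_start, pam, spacer) for each PAM followed by at least
    min_len bases; the spacer is truncated to max_len bases."""
    amplicon = amplicon.upper()
    sites = []
    for match in PAM_PATTERN.finditer(amplicon):
        pam_start = match.start()
        spacer_start = pam_start + 4  # first base after the 4-base PAM
        spacer = amplicon[spacer_start:spacer_start + max_len]
        if len(spacer) >= min_len:
            sites.append((pam_start, match.group(1), spacer))
    return sites


def is_conserved(pam, spacer, sequences):
    """Hypothetical conservation test: PAM + spacer occurs verbatim in every sequence."""
    target = pam + spacer
    return all(target in seq.upper() for seq in sequences)
```

The region passed as amplicon would be the sequence between a forward and reverse primer; an exact-match conservation test is the strictest possible choice and could be relaxed to tolerate mismatches.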


JamesGerardMann commented 4 years ago

Matt,

I've taken another look. The program has been generating Adv_ files in my directory. However, it deletes the files afterwards. I am unsure what's causing the issue. Does a later part of the script call for the Adv_ .fa file?

JamesGerardMann commented 4 years ago

@MatthewHiggins2017

I've looked into it further. This occurs randomly with a large dataset. I've edited the program to show me the samtools command output, and it seems that Adv_GTTGAATATTTACTTTAGATCATAAGCGGGTTGG_AB597287.1.fa is not being generated before the FastaToDict call. Do the calls to FastaToDict (line 30 or so) run sequentially after the samtools output is generated?

I was thinking about trying to fix it by replacing line 30 with a try statement so on exceptions it reruns the samtools generation command and hopefully "makes" the missing Adv file.
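A rough sketch of that retry idea, written as a standalone helper rather than as a patch to PrimedRPA. The helper name and its parameters are hypothetical, and the command is passed in as an argument list because the exact samtools call inside PrimedRPA is not shown in this thread.

```python
import os
import subprocess
import time


def run_until_file_exists(command, expected_file, attempts=3, delay=0.5):
    """Run command (an argument list) until expected_file exists and is
    non-empty. Returns True on success, False once all attempts are used."""
    for _ in range(attempts):
        # check=False: a failing command should trigger a retry, not an exception.
        subprocess.run(command, check=False)
        if os.path.isfile(expected_file) and os.path.getsize(expected_file) > 0:
            return True
        time.sleep(delay)  # short pause before trying again
    return False
```

If this returns False, the caller can skip the oligo or raise a clearer error instead of failing inside open(). A retry only helps if the file is missing because the command failed; it does not help if another process removes the file afterwards.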

JamesGerardMann commented 4 years ago

Again, this is an intermittent issue. Perhaps the samtools command is throwing an error. Do you know how I would go about editing the subprocess call to log its output? My hunch is that the file may not be generated because of some error with that particular sequence.

OR

The file is not being generated before the error occurs. When I checked the samtools command inputs (with a print statement showing each samtools command), I did not see the requested file being generated.
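One possible way to log what a subprocess prints, sketched with the standard library only. It assumes Python 3.7 or later (for capture_output and text) and that the command is given as an argument list; the logger name and the run_logged helper are made up for this example and are not part of PrimedRPA.

```python
import logging
import subprocess

# Including the process ID helps when several pool workers log at once.
logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s pid=%(process)d %(message)s")
log = logging.getLogger("samtools_debug")  # hypothetical logger name


def run_logged(command):
    """Run command (an argument list), log its exit code, stdout and
    stderr, and return the CompletedProcess object."""
    result = subprocess.run(command, capture_output=True, text=True)
    log.info("command=%s returncode=%d", command, result.returncode)
    if result.stdout:
        log.info("stdout: %s", result.stdout.strip())
    if result.stderr:
        log.warning("stderr: %s", result.stderr.strip())
    return result
```

A non-zero returncode or any stderr text for the failing sequence would support the first hunch. If the command is run through a shell with its output redirected into the .fa file, stdout will be empty here and only stderr will be informative.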

It's a really weird issue, as it's intermittent and the dataset works if I try enough times.

On that note, I'm probably overloading its capability: I'm using around 3,000 input sequences and about 3,400 background sequences.

JamesGerardMann commented 3 years ago

I've done some more testing.

I've edited the code to log every samtools command. It runs the command for the specified .fa file, but the file is still not there when line 32 tries to open it. I then wrapped line 32 in a try statement that reruns the samtools command and then the code on line 32. That doesn't seem to work either. I can confirm that running the samtools command outside of PrimedRPA does not throw any errors and the specified file is generated.

I'm starting to think it may be an issue with multiprocessing.
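If multiprocessing is the cause, one untested hypothesis is that two pool workers handle the same oligo and reference ID, so both use the same Adv_..._....fa name and one deletes the file while the other is about to read it. The sketch below shows the general technique of giving every call its own file name with tempfile.mkstemp. The function write_and_read is hypothetical and only stands in for the write, read, delete pattern; it does not reproduce PrimedRPA's code.

```python
import os
import tempfile


def write_and_read(seq, ref_id):
    """Write a FASTA record to a uniquely named temp file, read the
    sequence line back, and delete the file afterwards."""
    # mkstemp appends a random component, so concurrent calls with the
    # same seq and ref_id never share a path.
    fd, path = tempfile.mkstemp(prefix="Adv_{}_{}_".format(seq, ref_id),
                                suffix=".fa")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(">{}\n{}\n".format(ref_id, seq))
        with open(path) as handle:
            return handle.read().splitlines()[1]
    finally:
        os.remove(path)  # each call removes only its own file
```

A function like this can be handed to pool.starmap with repeated (seq, ref_id) pairs, and no worker can remove a file that another worker still needs.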

ww254409392 commented 2 years ago

This depends on the Python version and the samtools version. After upgrading samtools to 1.15 and using Python 3.6.13 (or 2.7, I am not sure which one), the program works!