davised closed 7 years ago
I started a run with my own data that was failing without any scripted error messages. I realized that my input files were .fna, not .fasta. I added a check that looks for input files in the given folder and exits if none are found, instead of attempting to continue.
Currently, the script is still running and has passed the point where it was dying previously.
To reproduce the error I was getting, give a directory without any .fasta files in it. I thought I had a problem with the format of the content of my input files, not the filenames.
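A minimal sketch of such a check, for illustration only. The helper names find_fasta_files and require_fasta_files and the extension parameter are hypothetical and are not taken from the script itself:

```python
import glob
import os
import sys


def find_fasta_files(input_dir, extension=".fasta"):
    """Return a sorted list of files in input_dir that end in extension."""
    pattern = os.path.join(input_dir, "*" + extension)
    return sorted(glob.glob(pattern))


def require_fasta_files(input_dir, extension=".fasta"):
    """Return the matching files, or exit with a clear message if there are none."""
    files = find_fasta_files(input_dir, extension)
    if not files:
        # Stop early so the failure points at the filenames, not the file contents.
        sys.exit("No %s files found in %s; check the file extensions."
                 % (extension, input_dir))
    return files
```

Calling require_fasta_files on a folder that holds only .fna files would stop the run right away with a message about the extension, rather than letting it die later without explanation.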
On our system, and presumably other large compute clusters, the compute nodes share a file system that is accessible by all (over the network), and each machine typically has a local scratch drive mapped to /data or /tmp. Since the /data and/or /tmp folders are on a different file system, os.link fails, because hard links cannot span file systems. When I use the default temp folder settings, or when specifically pointing to the /data drive, I get an OSError:
To resolve this, I imported the shutil copy function and added a try/except statement that falls back to the copy function when os.link fails.
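The fallback can be sketched roughly as follows. The wrapper name link_or_copy is hypothetical, and the actual change in the script may be structured differently:

```python
import os
from shutil import copy


def link_or_copy(src, dst):
    """Hard-link src to dst, falling back to a plain copy if linking fails."""
    try:
        os.link(src, dst)
    except OSError:
        # os.link raises OSError when src and dst are on different file systems.
        copy(src, dst)
```

This sketch catches any OSError, so it would also copy over a destination that already exists. Checking for errno.EXDEV would narrow the fallback to the cross-file-system case only.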
New output:
You could also just do an os.system("cp ... ...") command to follow the formatting of the rest of your code. I like the copy() function as the syntax is the same as the os.link function.
Additionally, I noticed that the provided stx2a nucleotide sequence, while corresponding to the STEC strain, fails when using tblastn (see notable LOG output):
LOG: 2016/12/19 14:31:11 - The following genes had no hits in datasets or are too short, values changed to 0, check names and output: stx2a
I updated the test dataset with the coding sequence of the stx2a gene so that it can properly be translated.
Output: