Closed sabrinadiemert closed 6 years ago
Hi @sabrinadiemert
I think I may know what's going on. The first 2 FASTA file inputs are being treated as a FASTA file path and genome name pair by the -i option, where genome1.fa is the file path and genome2.fa is the genome name. The rest of the input files are leftover arguments and are treated as normal input files.
The -i option is useful for when you have FASTA files that don't have the desired genome name encoded in the filename:
-i fasta_path genome_name, --input-fasta-genome-name fasta_path genome_name
fasta file path to genome name pair
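To illustrate how a two-value option can swallow the first two input files, here is a minimal argparse sketch. It is not the actual sistr_cmd parser: `build_parser` and `split_inputs` are hypothetical names, only the options mentioned in this thread are modelled, and it assumes the shell expanded a glob after -i into genome1.fa genome2.fa genome3.fa.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the real parser (assumption, not sistr_cmd code).
    parser = argparse.ArgumentParser(prog="sistr")
    # Plain input files are collected as leftover positional arguments.
    parser.add_argument("fastas", nargs="*", help="input FASTA files")
    parser.add_argument("-f")
    parser.add_argument("-o")
    # nargs=2 means -i always consumes exactly the next two arguments.
    parser.add_argument(
        "-i", "--input-fasta-genome-name",
        nargs=2, action="append",
        metavar=("fasta_path", "genome_name"),
        help="fasta file path to genome name pair",
    )
    return parser


def split_inputs(argv):
    """Return ((fasta_path, genome_name) pairs, leftover FASTA files)."""
    args = build_parser().parse_args(argv)
    pairs = [tuple(p) for p in (args.input_fasta_genome_name or [])]
    return pairs, args.fastas


with_i = ["-f", "tab", "-o", "SISTR_output.tab",
          "-i", "genome1.fa", "genome2.fa", "genome3.fa"]
print(split_inputs(with_i))
# ([('genome1.fa', 'genome2.fa')], ['genome3.fa'])

without_i = ["-f", "tab", "-o", "SISTR_output.tab",
             "genome1.fa", "genome2.fa", "genome3.fa"]
print(split_inputs(without_i))
# ([], ['genome1.fa', 'genome2.fa', 'genome3.fa'])
```

In the first call, genome2.fa ends up as the name for genome1.fa and only genome3.fa is left as a normal input, which would match an output with two rows instead of three.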
You could try running the command without the -i option:
sistr -f tab -o SISTR_output.tab *.fa
sistr_cmd should work with Python 3.6 if it's installed into a clean conda env. If you happen to have the error message or stacktrace you received when trying to run it with Python 3.6, then that would help me figure out what the issue might be.
Hope that helps!
Aha! Thanks @peterk87, that definitely solved the problem. Thanks to that explanation, I can see that I misinterpreted the description of the -i flag.
Here's the error that I received when running in my conda env with Python 3.6:
Traceback (most recent call last):
  File "/home/sabrina/miniconda3/envs/bioinfo/bin/sistr", line 11, in <module>
    load_entry_point('sistr-cmd==1.0.2', 'console_scripts', 'sistr')()
  File "/home/sabrina/miniconda3/envs/bioinfo/lib/python3.6/site-packages/sistr_cmd-1.0.2-py3.6.egg/sistr/sistr_cmd.py", line 324, in main
  File "/home/sabrina/miniconda3/envs/bioinfo/lib/python3.6/site-packages/sistr_cmd-1.0.2-py3.6.egg/sistr/sistr_cmd.py", line 324, in <listcomp>
  File "/home/sabrina/miniconda3/envs/bioinfo/lib/python3.6/site-packages/sistr_cmd-1.0.2-py3.6.egg/sistr/sistr_cmd.py", line 194, in sistr_predict
  File "/home/sabrina/miniconda3/envs/bioinfo/lib/python3.6/site-packages/sistr_cmd-1.0.2-py3.6.egg/sistr/src/cgmlst/__init__.py", line 342, in run_cgmlst
  File "/home/sabrina/miniconda3/envs/bioinfo/lib/python3.6/site-packages/sistr_cmd-1.0.2-py3.6.egg/sistr/src/cgmlst/__init__.py", line 134, in get_allele_sequences
  File "/home/sabrina/miniconda3/envs/bioinfo/lib/python3.6/site-packages/sistr_cmd-1.0.2-py3.6.egg/sistr/src/cgmlst/msa.py", line 58, in msa_ref_vs_novel
  File "/home/sabrina/miniconda3/envs/bioinfo/lib/python3.6/site-packages/sistr_cmd-1.0.2-py3.6.egg/sistr/src/cgmlst/msa.py", line 37, in msa_mafft
  File "/home/sabrina/miniconda3/envs/bioinfo/lib/python3.6/subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "/home/sabrina/miniconda3/envs/bioinfo/lib/python3.6/subprocess.py", line 1344, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'mafft': 'mafft'
(bioinfo)
I think this is because of ete3's mafft installation, since I noticed this issue right after installing it, although I'm not positive. At any rate, I reinstalled sistr_cmd into a clean Python 3.6 conda env and it's working fine.
That's great that it's working now!
I'll make the usage of the -i option clearer in the next version and in the docs.
Thanks for the info about ete3's mafft not playing nice with the version sistr_cmd needs. That's definitely something to keep in mind as development continues.
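For anyone hitting the same FileNotFoundError, a quick check like the one below can confirm whether an external program is visible on the PATH of the active conda env before running anything. This is only a sketch: `missing_tools` is a hypothetical helper, not part of sistr_cmd, and the tool list contains just mafft because that is the only binary named in the traceback.

```python
import shutil


def missing_tools(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


# Assumed list: mafft comes from the traceback in this thread; add others as needed.
missing = missing_tools(["mafft"])
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("All external tools found")
```

Running it inside the affected env versus a clean one should show whether the binary went missing or was shadowed after another package was installed.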
Hi @peterk87,
I'm having a strange problem running sistr_cmd. It seems like all of the output .tab files that are produced through sistr_cmd are missing the first genome that is assessed. For example, if I run the following command within a folder containing three genomes (genome1.fa, genome2.fa, and genome3.fa):
... the SISTR_output.tab file only has two row entries. Even weirder, it seems to be combining the first two genomes it encounters (at least, as far as I can tell from two of the columns in SISTR_output.tab):
Any idea what might be happening here? I'm running this in Linux (Ubuntu 17.04) within a conda environment with Python 2.7; I previously ran sistr_cmd in a separate conda environment with Python 3.6 but it seemed like I was having some package interference with ETE3. Looking back over my results from those runs, this problem was happening at that time, too.