cdanielmachado / carveme

CarveMe: genome-scale metabolic model reconstruction

observed issues with omission of '--dna ,-o' option in command #33

Closed rdmtinez closed 5 years ago

rdmtinez commented 5 years ago

Greetings,

I recently noticed what happens when the '-o' or '--dna' options are omitted or given incorrectly. For example (--dna parameter omitted):

carve -r -v -d ./assemblies/INPUT.fna -o ./carve_output/output.xml

I obtain an output file named INPUT.tsv with contents like:

contig_4 iSSON_1240.SSON_1864 60.3 884 343 6 137418 140063 12 889 1.3e-307 1061.6
contig_4 iECH74115_1262.ECH74115_1910 60.0 883 345 6 137421 140063 13 889 5.1e-307 1059.7

and a folder named "output.xml". What values does this output represent?
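For reference, the rows have 12 fields, which matches DIAMOND's default tabular output (BLAST format 6): the diamond commands logged in this thread pass no output format option, so the default column order is assumed here. The first column is the query (a contig), and the second appears to be a model and gene identifier from the bigg_proteins database, although that reading is a guess. A minimal sketch for labelling the columns, where `parse_diamond_tsv` is a hypothetical helper and the sample rows are re-typed with tab separators:

```python
import csv
import io

# Default column names of DIAMOND's tabular output (format 6 with no
# explicit field list). Assumed to apply because the logged diamond
# commands do not request another output format.
DIAMOND_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def parse_diamond_tsv(handle):
    """Yield one dict per alignment row, keyed by column name."""
    reader = csv.reader(handle, delimiter="\t")
    for row in reader:
        if len(row) != len(DIAMOND_COLUMNS):
            continue  # skip blank or malformed lines
        yield dict(zip(DIAMOND_COLUMNS, row))


# The two rows quoted above, with tabs between the fields.
sample = (
    "contig_4\tiSSON_1240.SSON_1864\t60.3\t884\t343\t6\t137418\t140063"
    "\t12\t889\t1.3e-307\t1061.6\n"
    "contig_4\tiECH74115_1262.ECH74115_1910\t60.0\t883\t345\t6\t137421"
    "\t140063\t13\t889\t5.1e-307\t1059.7\n"
)
hits = list(parse_diamond_tsv(io.StringIO(sample)))
for hit in hits:
    print(hit["qseqid"], hit["sseqid"], hit["pident"], hit["evalue"])
```

Read this way, each row is one alignment: percent identity, alignment length, mismatches, gap openings, query and subject coordinates, e-value, and bit score.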

The following error was also observed (actual name of files below):

Running diamond...
diamond blastp -d /home/martinez/anaconda3/lib/python3.6/site-packages/carveme/data/input/bigg_proteins.dmnd -q ./assemblies/LjNodule210.fna -o ./assemblies/LjNodule210.tsv --more-sensitive --top 10
Failed to run diamond.

The following (-o parameter omitted):

carve -r -v -d --dna ./assemblies/INPUT.fna ./carve_output/output.xml

produces both an INPUT.tsv file and an INPUT.xml file, but no output.xml file (i.e. the name is unchanged). I am unsure about the output itself, as I haven't checked the effect of the parameters on its content, but the following error was observed:

Error: Error opening file ./carve_output/without_o.xml
Running diamond...
diamond blastx -d /home/martinez/anaconda3/lib/python3.6/site-packages/carveme/data/input/bigg_proteins.dmnd -q ./assemblies/LjNodule211.fna -o ./assemblies/LjNodule211.tsv --more-sensitive --top 10
Loading universe model...
Scoring reactions...
Reconstructing a single model
Running diamond...
diamond blastx -d /home/martinez/anaconda3/lib/python3.6/site-packages/carveme/data/input/bigg_proteins.dmnd -q ./carve_output/without_o.xml -o ./carve_output/without_o.tsv --more-sensitive --top 10
Failed to run diamond.
Done.

The following command (both parameters omitted):

carve -r -v -d ./assemblies/INPUT.fna ./carve_output/output.xml

produces only an INPUT.tsv file and the following error:

Error: Error opening file ./carve_output/without_both.xml
Running diamond...
diamond blastp -d /home/martinez/anaconda3/lib/python3.6/site-packages/carveme/data/input/bigg_proteins.dmnd -q ./carve_output/without_both.xml -o ./carve_output/without_both.tsv --more-sensitive --top 10
Failed to run diamond.
Running diamond...
diamond blastp -d /home/martinez/anaconda3/lib/python3.6/site-packages/carveme/data/input/bigg_proteins.dmnd -q ./assemblies/LjNodule212.fna -o ./assemblies/LjNodule212.tsv --more-sensitive --top 10
Failed to run diamond.

The solution is definitely to state the options in the command, but I figured knowing this much could spare someone a headache ;)

Note that none of the outputs go to the actual "carve_output" folder.

cdanielmachado commented 5 years ago

Hi @rdmtinez,

Thanks for the detailed description. Why are you using the -r option with a single genome file?

Would you mind trying again without the -r option and seeing if the problem persists?

Also, if you are using .fna files (i.e. nucleotide sequences) you must always use the --dna option.
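One way to avoid forgetting the flag in a batch script is to build the command from the file extension. The following is only a sketch: `build_carve_command` is a hypothetical helper, it treats `.fna` as the only nucleotide extension, and it uses only the `--dna` and `-o` options mentioned in this thread.

```python
from pathlib import Path

# Extensions treated as nucleotide FASTA in this sketch. Only ".fna" is
# taken from the thread; extend the set to match your own file naming.
NUCLEOTIDE_EXTENSIONS = {".fna"}


def build_carve_command(genome, output):
    """Return an argv list for a single-genome `carve` run."""
    cmd = ["carve"]
    if Path(genome).suffix.lower() in NUCLEOTIDE_EXTENSIONS:
        cmd.append("--dna")  # nucleotide input always needs --dna
    # Always pass the output explicitly with -o, so it can never be
    # mistaken for a second input file.
    cmd += [str(genome), "-o", str(output)]
    return cmd


print(build_carve_command("./assemblies/INPUT.fna", "./carve_output/output.xml"))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` for each genome in turn.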

rdmtinez commented 5 years ago

Greetings @cdanielmachado,

I used -r because I was running a batch process when I noticed the errors, and I forgot to delete it when writing my post. Just a typo, really.

-Ricardo

cdanielmachado commented 5 years ago

Hi Ricardo,

I am not sure I understand what issue you are trying to report here...

If I understand correctly, you are saying if the correct options are not provided, things do not work as expected.

If you give the program nucleotide sequences without the --dna flag it fails.

If you do not give the -o option, things don't go into the expected output folder.

Is that correct?

rdmtinez commented 5 years ago

That's exactly right @cdanielmachado ... I had some typos in my code, noticed these things, and decided to post about them. It was just surprising to see the "output" being a folder with the given name rather than a file when '-o' is omitted, so I decided to mention it.

cdanielmachado commented 5 years ago

Ok, I think I understand it a bit better now.

All arguments which are not preceded by a flag are considered to be input files. Therefore, when you omit the -o flag, it will take your second argument, and assume it is a second genome file.

If you don't use the -r option, you get the following error:

carve: error: Use -r when specifying more than one input file

But because you used -r, it assumed you were giving it multiple genome files to run in parallel. Diamond will receive your output folder as a genome file to process, and it will fail.
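The behaviour described above can be reproduced with a small argparse sketch. This is not the actual CarveMe parser, only a hypothetical stand-in (`carve-sketch`) with a single positional argument that accepts any number of files, plus the `-o`, `-r` and `--dna` flags named in this thread.

```python
import argparse


def make_parser():
    # Hypothetical stand-in for the real command line interface.
    parser = argparse.ArgumentParser(prog="carve-sketch")
    # Every argument not attached to a flag lands in this list.
    parser.add_argument("input", nargs="+", help="genome file(s)")
    parser.add_argument("-o", dest="output", help="output file or folder")
    parser.add_argument("-r", dest="recursive", action="store_true")
    parser.add_argument("--dna", action="store_true")
    return parser


def parse(argv):
    parser = make_parser()
    args = parser.parse_args(argv)
    # Same check as the error message quoted above.
    if len(args.input) > 1 and not args.recursive:
        parser.error("Use -r when specifying more than one input file")
    return args


# -o omitted: the intended output path is collected as a second input.
args = parse(["-r", "--dna", "./assemblies/INPUT.fna", "./carve_output/output.xml"])
print(args.input)
print(args.output)
```

In this sketch `args.input` holds both paths and `args.output` stays `None`, which mirrors why the output path ends up being passed to Diamond as if it were a genome. Dropping `-r` from the same call makes the parser exit with the error message instead.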

I think there is really nothing that can be done here. It is normal for command line tools to fail when you don't specify the arguments correctly.
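On the user's side, a wrapper script can still catch this kind of mistake before Diamond is launched. A minimal sketch, assuming that every input should be an existing FASTA file whose first non-blank line starts with `>`; both function names are hypothetical and not part of CarveMe.

```python
from pathlib import Path


def looks_like_fasta(path):
    """True if `path` is a regular file whose first non-blank line starts with '>'."""
    p = Path(path)
    if not p.is_file():
        return False  # missing paths and folders are rejected
    with p.open("r", errors="replace") as handle:
        for line in handle:
            if line.strip():
                return line.startswith(">")
    return False  # empty file


def suspicious_inputs(paths):
    """Return the paths that should not be passed on as genome files."""
    return [p for p in paths if not looks_like_fasta(p)]
```

Calling `suspicious_inputs` on the positional arguments of a batch run would flag a path such as `./carve_output/output.xml` when it is a folder or does not exist yet, which is the situation produced by a forgotten `-o`.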