Closed limin321 closed 4 years ago
Without -f, PIRATE runs on all CDS (genes) in the collection; with -f "tRNA,rRNA" it runs only on the tRNA/rRNA features. All the details on the relevant options are in the README. Comparing the two sets of outputs will clearly show the difference.
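To see how many features each run has to work with, you can count the feature types in the input GFF files. Below is a minimal sketch, not part of PIRATE: it assumes standard GFF3 input (feature type in the third tab-separated column, optional `##FASTA` section at the end) and files ending in `.gff`. The function names `count_feature_types` and `count_directory` are made up for this example.

```python
from collections import Counter
from pathlib import Path


def count_feature_types(lines):
    """Count GFF3 feature types (column 3) in an iterable of lines."""
    counts = Counter()
    for line in lines:
        # The sequence section, if present, follows the annotation lines.
        if line.startswith("##FASTA"):
            break
        # Skip directives, comments and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # A valid GFF3 feature line has nine tab-separated columns.
        if len(fields) >= 9:
            counts[fields[2]] += 1
    return counts


def count_directory(gff_dir):
    """Sum feature-type counts over every *.gff file in a directory."""
    total = Counter()
    for path in sorted(Path(gff_dir).glob("*.gff")):
        with open(path) as handle:
            total += count_feature_types(handle)
    return total
```

For example, `count_directory("./311gff/").most_common()` lists the feature types from most to least frequent. If CDS features far outnumber tRNA and rRNA features, a much shorter run time and far fewer gene families with -f "tRNA,rRNA" would be expected.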
Hi, I ran the following two commands, and both created the same number of output files.
PIRATE -i ./311gff/ -o ./311Agro_panOut_RNAcalled -f "tRNA,rRNA" -a -r -k "-f 6" -t 40
PIRATE -i ./311gff/ -o ./311Agro_panOut -a -r -k "-f 6" -t 40
The only difference is that the second command omits -f "tRNA,rRNA". Both commands create the same number of output files, and neither prints an error message. However, the output of the first command looks odd: it finished in less than 10 minutes, while the second command ran for around 9 hours. Here is a comparison of PIRATE.pangenome_summary.txt from both commands. First command:
215 gene families in 311 genomes.
19 contain greater than one allele at the thresholds analysed.
0 contain fission/fusion events.
0 contain duplication/loss.
%isolates  #clusters  >1 allele  fission/fusion  multicopy
0-10%      160        4          0               0
10-25%     5          1          0               0
25-50%     6          0          0               0
50-75%     5          0          0               0
75-90%     1          0          0               0
90-95%     1          1          0               0
95-100%    37         13         0               0
Second command output:
52233 gene families in 311 genomes.
3985 contain greater than one allele at the thresholds analysed.
2341 contain fission/fusion events.
886 contain duplication/loss.
%isolates  #clusters  >1 allele  fission/fusion  multicopy
0-10%      42290      1859       650             167
10-25%     3061       790        294             144
25-50%     3207       610        408             163
50-75%     1101       293        286             124
75-90%     503        162        158             100
90-95%     104        25         39              23
95-100%    1967       246        506             165
Can anyone help explain what might be going wrong when -f "tRNA,rRNA" is added? There is no error message, but it doesn't make sense that it finishes in less than 10 minutes when the second command ran for 9 hours.
Thanks. Limin