No gene annotation in gene_presence_absence.csv output

dinesh1st commented 5 years ago

Can you please advise why I am not getting gene annotations in the gene_presence_absence.csv output file? Previously, this output contained the names and annotations of all the genes. This time, however, Roary gave me only group numbers. The difference between the two runs is that earlier I used draft genomes and this time I am using complete genomes. Does that make a difference?

tseemann commented 5 years ago

@dinesh1st can you paste the first 3 lines of gene_presence_absence.csv here? E.g. run head -n 3 gene_presence_absence.csv

Also, paste the first 3 CDS lines from one of your .gff input files, e.g. run grep CDS YOURFILE.gff | head -n 3

dinesh1st commented 5 years ago

My CSV output looks like this:

Gene	Annotation	No. isolates	No. sequences	Avg sequences per isolate	Avg group size nuc	1	2	3	4
group_1		4	4	1		PAO1_01930(+)	PA14_03260(-)	PA34_03444(-)	VRFPA01_02925(+)
group_1000	4	4	1		PAO1_03900(+)	PA14_01208(-)	PA34_01191(-)	VRFPA01_04900(+)
group_1001	4	4	1		PAO1_03912(-)	PA14_01196(+)	PA34_01179(+)	VRFPA01_04912(-)
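A quick way to see how many clusters lack an annotation is to scan the file for rows whose Annotation field is empty. The following is a minimal sketch, not part of Roary: the function name `unannotated_groups` is hypothetical, the column names `Gene` and `Annotation` are taken from the header pasted above, and the sample rows are illustrative only. It assumes a comma-separated file; pass `delimiter="\t"` for a tab-separated paste.

```python
import csv
import io

def unannotated_groups(csv_text, delimiter=","):
    """Return the Gene-column value of every row whose Annotation is empty."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    missing = []
    for row in reader:
        # Treat a missing or whitespace-only Annotation field as "no annotation".
        annotation = (row.get("Annotation") or "").strip()
        if not annotation:
            missing.append(row["Gene"])
    return missing

# Illustrative rows only (not real Roary output).
sample = (
    "Gene,Annotation,No. isolates\n"
    "group_1,,4\n"
    "dnaA,Chromosomal replication initiator protein DnaA,4\n"
)
print(unannotated_groups(sample))
```

If every row comes back as missing, as in the output above, the problem is more likely in how annotations were read from the inputs than in a handful of hypothetical proteins.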

One of my GFF files looks like this:

##gff-version 3
##sequence-region gnl|Prokka|PAO1_1 1 6264404
gnl|Prokka|PAO1_1 Prodigal:2.6 CDS 483 2027 . + 0 ID=PAO1_00001;Parent=PAO1_00001_gene;Name=dnaA;gene=dnaA;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P03004;locus_tag=PAO1_00001;product=Chromosomal replication initiator protein DnaA;protein_id=gnl|Prokka|PAO1_00001
gnl|Prokka|PAO1_1 prokka gene 483 2027 . + . ID=PAO1_00001_gene;Name=dnaA;gene=dnaA;locus_tag=PAO1_00001
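To confirm that the input really carries gene names and products, the ninth GFF3 column of a CDS record can be split into key/value pairs. This is a rough sketch under stated assumptions: `cds_attributes` is a hypothetical helper, the record is assumed to be tab-separated as the GFF3 format requires, and percent-encoded values are not decoded. Whether Roary reads exactly these attributes is an assumption here, not something this thread confirms.

```python
def cds_attributes(gff_line):
    """Parse the attribute column of a tab-separated GFF3 CDS record.

    Returns a dict of attributes, or None if the line is not a CDS record.
    """
    cols = gff_line.rstrip("\n").split("\t")
    if len(cols) != 9 or cols[2] != "CDS":
        return None
    attrs = {}
    for field in cols[8].split(";"):
        if "=" in field:
            # Split on the first '=' only, since values may contain '='.
            key, value = field.split("=", 1)
            attrs[key] = value
    return attrs

# A shortened version of the CDS record pasted above, rebuilt with tabs.
line = "\t".join([
    "gnl|Prokka|PAO1_1", "Prodigal:2.6", "CDS", "483", "2027", ".", "+", "0",
    "ID=PAO1_00001;Name=dnaA;gene=dnaA;locus_tag=PAO1_00001;"
    "product=Chromosomal replication initiator protein DnaA",
])
attrs = cds_attributes(line)
print(attrs.get("gene"), "|", attrs.get("product"))
```

For the record above this finds both a `gene` and a `product` value, which suggests the input annotation itself is intact.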

andrewjpage commented 5 years ago

Could you paste the output of roary -a?

dinesh1st commented 5 years ago

2018/10/23 22:13:17 Optional tool 'Rscript' not found in your $PATH
2018/10/23 22:13:17 Looking for 'awk' - found /usr/bin/awk
2018/10/23 22:13:17 Looking for 'bedtools' - found /Users/dinesh/anaconda2/bin/bedtools
2018/10/23 22:13:17 Determined bedtools version is 2.27
2018/10/23 22:13:17 Looking for 'blastp' - found /Users/dinesh/anaconda2/bin/blastp
2018/10/23 22:13:20 Determined blastp version is 2.7.1
2018/10/23 22:13:20 Looking for 'grep' - found /usr/bin/grep
2018/10/23 22:13:20 Optional tool 'kraken' not found in your $PATH
2018/10/23 22:13:20 Optional tool 'kraken-report' not found in your $PATH
2018/10/23 22:13:20 Looking for 'mafft' - found /Users/dinesh/anaconda2/bin/mafft
2018/10/23 22:13:20 Determined mafft version is 7.407
2018/10/23 22:13:20 Looking for 'makeblastdb' - found /Users/dinesh/anaconda2/bin/makeblastdb
2018/10/23 22:13:20 Determined makeblastdb version is 2.7.1
2018/10/23 22:13:20 Looking for 'mcl' - found /Users/dinesh/anaconda2/bin/mcl
2018/10/23 22:13:20 Determined mcl version is 14-137
2018/10/23 22:13:20 Looking for 'parallel' - found /Users/dinesh/anaconda2/bin/parallel
2018/10/23 22:13:21 Determined parallel version is 20160622
2018/10/23 22:13:21 Looking for 'prank' - found /Users/dinesh/anaconda2/bin/prank
2018/10/23 22:13:21 Looking for 'sed' - found /usr/bin/sed
2018/10/23 22:13:21 Looking for 'cd-hit' - found /Users/dinesh/anaconda2/bin/cd-hit
2018/10/23 22:13:21 Determined cd-hit version is 4.7
2018/10/23 22:13:21 Looking for 'FastTree' - found /Users/dinesh/anaconda2/bin/FastTree
2018/10/23 22:13:21 Determined FastTree version is 2.1
2018/10/23 22:13:21 Roary version 3.7.0
2018/10/23 22:13:21 Error: You need to provide at least 2 files to build a pan genome
Usage: roary [options] *.gff
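The versions reported in this dependency check can be pulled out programmatically, which makes it easier to compare an installation against current releases. The sketch below is only illustrative: `tool_versions` is a hypothetical helper, and the regular expressions assume the exact "Determined X version is Y" and "Roary version Y" wording seen in the log above.

```python
import re

def tool_versions(log_text):
    """Map each tool name to the version reported in a `roary -a` log."""
    # Lines of the form: "... Determined <tool> version is <version>"
    versions = dict(re.findall(r"Determined (\S+) version is (\S+)", log_text))
    # Roary reports its own version in a differently worded line.
    match = re.search(r"Roary version (\S+)", log_text)
    if match:
        versions["Roary"] = match.group(1)
    return versions

# Three entries copied from the log above.
log = (
    "2018/10/23 22:13:17 Determined bedtools version is 2.27\n"
    "2018/10/23 22:13:20 Determined blastp version is 2.7.1\n"
    "2018/10/23 22:13:21 Roary version 3.7.0\n"
)
print(tool_versions(log))
```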

andrewjpage commented 5 years ago

It looks like you are running an older version of Roary (released 2 years ago). Please upgrade to the latest version and try again.

