Open ingridvanw opened 2 months ago
Hi guys,
I ran PIRATE with the following command line:
PIRATE -i /gff/ --steps 50,60,70,80,90,95,98 --features CDS --align --rplots --threads 4 --output /results/
There are around 500 GFF files from Staph. aureus.
A GFF file looks like:
##gff-version 3
##sequence-region gnl|Bactopia|SAB003_1 1 854818
##sequence-region gnl|Bactopia|SAB003_2 1 492009
##sequence-region gnl|Bactopia|SAB003_3 1 236753
##sequence-region gnl|Bactopia|SAB003_4 1 199243
##sequence-region gnl|Bactopia|SAB003_5 1 179702
##sequence-region gnl|Bactopia|SAB003_6 1 140822
##sequence-region gnl|Bactopia|SAB003_7 1 134421
##sequence-region gnl|Bactopia|SAB003_8 1 108311
##sequence-region gnl|Bactopia|SAB003_9 1 91633
##sequence-region gnl|Bactopia|SAB003_10 1 79645
##sequence-region gnl|Bactopia|SAB003_11 1 67889
##sequence-region gnl|Bactopia|SAB003_12 1 39179
##sequence-region gnl|Bactopia|SAB003_13 1 33043
##sequence-region gnl|Bactopia|SAB003_14 1 29835
##sequence-region gnl|Bactopia|SAB003_15 1 25963
##sequence-region gnl|Bactopia|SAB003_16 1 18742
##sequence-region gnl|Bactopia|SAB003_17 1 11161
##sequence-region gnl|Bactopia|SAB003_18 1 2094
##sequence-region gnl|Bactopia|SAB003_19 1 1649
##sequence-region gnl|Bactopia|SAB003_20 1 1458
##sequence-region gnl|Bactopia|SAB003_21 1 1039
##sequence-region gnl|Bactopia|SAB003_22 1 501
##sequence-region gnl|Bactopia|SAB003_23 1 431
##sequence-region gnl|Bactopia|SAB003_24 1 410
##sequence-region gnl|Bactopia|SAB003_25 1 403
##sequence-region gnl|Bactopia|SAB003_26 1 360
##sequence-region gnl|Bactopia|SAB003_27 1 315
##sequence-region gnl|Bactopia|SAB003_28 1 310
gnl|Bactopia|SAB003_1 prokka gene 944 1540 . - . ID=SAB003_00001_gene;Name=recR;gene=recR;locus_tag=SAB003_00001
gnl|Bactopia|SAB003_1 Prodigal:002006 CDS 944 1540 . - 0 ID=SAB003_00001;Parent=SAB003_00001_gene;Name=recR;gene=recR;inference=ab initio prediction:Prodigal:002006;locus_tag=SAB003_00001;product=recombination mediator RecR;protein_id=gnl|Bactopia|SAB003_00001
.....
I was wondering why there are so many N's in pangenome_alignment.fasta? It looks as if every individual isolate has been added to the pangenome?
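To put a number on "so many N's", a rough sketch like the one below could report the fraction of N characters per sequence. It assumes pangenome_alignment.fasta is an ordinary FASTA file with one record per isolate. The function names are hypothetical helpers written for this illustration and are not part of PIRATE.

```python
import sys
from collections import Counter


def _fraction_n(counts):
    """Return the share of 'N' among all counted characters (0.0 if empty)."""
    total = sum(counts.values())
    return counts["N"] / total if total else 0.0


def n_fraction_per_sequence(handle):
    """Yield (sequence_id, fraction_of_N) for each record in a FASTA handle.

    Hypothetical helper for illustration; it only counts characters and
    makes no assumption about how the alignment was produced.
    """
    name, counts = None, Counter()
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Emit the previous record before starting a new one.
            if name is not None:
                yield name, _fraction_n(counts)
            # Use the first whitespace-delimited token of the header as the ID.
            name = (line[1:].split() or [""])[0]
            counts = Counter()
        else:
            counts.update(line.upper())
    if name is not None:
        yield name, _fraction_n(counts)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: pass the path to the alignment as the first argument.
    with open(sys.argv[1]) as fh:
        for seq_id, frac in n_fraction_per_sequence(fh):
            print(f"{seq_id}\t{frac:.3f}")
```

Saved under any file name and run with the path to the alignment as its only argument, this prints one tab-separated line per isolate, which makes it easy to see whether the N's are spread evenly or concentrated in particular genomes.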