gtonkinhill / panaroo

An updated pipeline for pangenome investigation
MIT License
259 stars, 33 forks

Genes present in prokka gff files but present as group_xxx in gene_presence_absence_file #228

Closed (simrangambhir closed 1 year ago)

simrangambhir commented 1 year ago

Hi, I have annotated a number of bacterial genomes with Prokka, and the resulting GFF files contain names for the genes in question. However, after building the pangenome with Panaroo at the default threshold, the same genes appear as group_xxx in the gene_presence_absence file that Panaroo generates. It would be great if there were a way to prevent this. Thanks, Simran

gtonkinhill commented 1 year ago

Hi,

Do you mean that you wish the group_xxx was replaced with the gene name? Panaroo currently labels clusters as 'group' if the same gene name is duplicated in multiple different clusters. This is the approach that Roary takes.
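To make the naming rule above concrete, here is a toy sketch in Python. It is not Panaroo's actual code: the function `label_clusters`, its input format, and the numbering of the `group_` labels are all hypothetical, chosen only to illustrate the behaviour described in this comment. A cluster keeps its gene name only when no other cluster shares that name.

```python
from collections import Counter


def label_clusters(cluster_names):
    """Assign one label per cluster (hypothetical helper, not Panaroo's API).

    cluster_names holds one candidate gene name per cluster, or None when
    the annotation provided no name for that cluster.
    """
    # Count how many clusters carry each candidate gene name.
    counts = Counter(name for name in cluster_names if name)
    labels = []
    for idx, name in enumerate(cluster_names):
        if name and counts[name] == 1:
            # The name maps to exactly one cluster, so it is unambiguous.
            labels.append(name)
        else:
            # Unnamed, or the name is shared by several clusters:
            # fall back to a generic label (numbering scheme is assumed).
            labels.append(f"group_{idx}")
    return labels


example = ["dnaA", "tnpA", "tnpA", None, "gyrB"]
print(label_clusters(example))
```

In this example the two clusters that both carry the name `tnpA` receive generic labels, even though the underlying annotation files do contain a gene name for them, which matches the situation reported in this issue.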

simrangambhir commented 1 year ago

Thank you for your response!