NBChub / bgcflow

Snakemake workflow for the analysis of biosynthetic gene clusters across large collections of genomes (pangenomes)
https://github.com/NBChub/bgcflow/wiki
MIT License

Feat: Pangenome graph module #255

Open OmkarSaMo opened 1 year ago

OmkarSaMo commented 1 year ago

This is a request for a new feature: a rule that uses PPanGGOLiN to create pangenome graphs for the genome projects.

I think we should use the Roary-defined gene families as input to PPanGGOLiN. This would keep the gene family IDs consistent with EggNOG and other annotations.

This can be achieved by providing our own gene families to PPanGGOLiN instead of letting it cluster the genes itself.

Step 1: Use the GFF annotations, just like the Roary input

ppanggolin annotate --anno ORGANISM_ANNOTATION_LIST

Step 2: Provide gene families

ppanggolin cluster -p pangenome.h5 --clusters MY_CLUSTERS_FILE

MY_CLUSTERS_FILE should be created from the Roary output.
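A rough sketch of how MY_CLUSTERS_FILE could be built, in case it helps. It assumes Roary's `clustered_proteins` output uses one line per family in the form `family: gene1<TAB>gene2...`, and that `ppanggolin cluster --clusters` accepts a TSV with the family ID in the first column and the gene ID in the second. Both layouts should be checked against the installed versions. The function name `roary_to_ppanggolin` is made up for illustration and is not part of bgcflow.

```python
import csv
import io


def roary_to_ppanggolin(clustered_proteins, out_tsv):
    """Write one "family<TAB>gene" row per gene found in Roary's clusters.

    clustered_proteins: iterable of lines (assumed Roary layout, see below).
    out_tsv: writable text handle for the PPanGGOLiN clusters file.
    Returns the number of rows written.
    """
    writer = csv.writer(out_tsv, delimiter="\t", lineterminator="\n")
    n_rows = 0
    for line in clustered_proteins:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumed Roary layout: "<family>: <gene1>\t<gene2>\t..."
        family, _, members = line.partition(": ")
        for gene in members.split():
            writer.writerow([family, gene])
            n_rows += 1
    return n_rows


if __name__ == "__main__":
    # Tiny in-memory example with made-up family and gene IDs.
    example = io.StringIO("dnaA: g1_00001\tg2_00001\ngroup_2: g1_00002\n")
    result = io.StringIO()
    roary_to_ppanggolin(example, result)
    print(result.getvalue(), end="")
```

For real data, the two handles would be `open("clustered_proteins")` and `open("MY_CLUSTERS_FILE", "w", newline="")`. The gene IDs in the Roary output need to match the gene IDs PPanGGOLiN reads from the annotation files in Step 1, otherwise the clusters will not map.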

matinnuhamunada commented 1 year ago

An experimental version of this feature is now available in: https://github.com/NBChub/bgcflow/tree/ppanggolin3

Usage:

OmkarSaMo commented 1 year ago

Thanks. I will try this and let you know.