SegataLab / panphlan

PanPhlAn is a strain-level metagenomic profiling tool for identifying the gene composition of individual strains in metagenomic samples
MIT License

Index for manual {species}_pangenome.tsv #37

Open Hocnonsense opened 2 years ago

Hocnonsense commented 2 years ago

https://github.com/SegataLab/panphlan/blob/a7846e4ada410fe791ded6297551be96883a7048/panphlan_map.py#L361-L362

Hello, and thanks for your code. I'm trying to apply this software to my metagenome analysis, and I've already annotated all my MAGs with prodigal followed by mmseqs and eggNOG. So I'd like to generate a {species}_pangenome.tsv table without PanPhlAn_pangenome_exporter. However, I'm not sure which convention this table uses for gene start and stop positions. For example, in a GFF table, a gene that "starts at 3 and ends at 11" refers to a 9 bp sequence (1-based, both ends inclusive):

>contig_1
AATCGTCGTCGA
  ^       ^

>contig1|gene1
TCGTCGTCG

What should the "start" and "stop" numbers in {species}_pangenome.tsv be? Can I just copy the raw numbers from the GFF file into this table? Thanks for your advice!
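To make the GFF convention in the example above concrete, here is a minimal Python sketch of how I read those coordinates (1-based, both ends inclusive). The helper name `gff_slice` is hypothetical and is not part of PanPhlAn. It only illustrates the GFF side of the question and assumes nothing about what {species}_pangenome.tsv expects.

```python
def gff_slice(contig_seq: str, start: int, end: int) -> str:
    """Return the subsequence for GFF-style coordinates.

    GFF coordinates are 1-based and inclusive on both ends, so they
    map to the Python slice [start - 1:end].
    (Hypothetical helper for illustration; not a PanPhlAn function.)
    """
    if not 1 <= start <= end <= len(contig_seq):
        raise ValueError(f"invalid GFF interval {start}-{end}")
    return contig_seq[start - 1:end]


contig_1 = "AATCGTCGTCGA"          # the contig from the example above
gene_1 = gff_slice(contig_1, 3, 11)

print(gene_1)       # TCGTCGTCG
print(len(gene_1))  # 9, i.e. end - start + 1
```

If the table instead used 0-based, half-open coordinates, the same gene would be written as start 2 and stop 11, which is why the convention matters here.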