marschall-lab / panacus

Panacus is a tool for computing statistics for GFA-formatted pangenome graphs
MIT License

Validity of results #32

Open aaaniich opened 1 month ago

aaaniich commented 1 month ago

Good afternoon! I'm sorry to bother you. Could you please advise me on what might be wrong with my results? I saw a thread above with a similar case, but I'm confused by the fact that my results differ so much between bp and edges. Could this be because I didn't use any masking for the satellite sequence?

[Attached figures: 12_hap_hg38 gfa_growth_edge, 12_hap_hg38 gfa_growth_node, 12_hap_hg38 gfa_growth_bp]

danydoerr commented 3 weeks ago

Dear Anna, apologies for the late response; I have been on vacation. Edge and bp counts have different properties, which is why they also exhibit different growth curves. For instance, it is expected that the core genome (coverage >= 1, quorum = 100%) of edges tends to converge towards 0, as edges that are shared between all genomes should not exist, except in the presence of segmental duplication.

From the figures alone, I cannot spot any obvious mistake in your analysis.
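
To make the difference between the count types concrete, here is a minimal Python sketch on made-up data. It is not panacus code and does not reproduce how panacus computes its curves: the `growth` function, the three toy haplotype walks, and the node lengths are all hypothetical, the path order is fixed, and node orientation is ignored. It only illustrates how the same core definition (coverage >= 1, quorum = 100%) can keep nearly all bp while dropping every edge.

```python
from collections import Counter
from math import ceil


def growth(paths, coverage=1, quorum=0.0, weight=None):
    """Toy growth curve for one fixed path order (hypothetical helper).

    At step k, an item is counted if it occurs in at least
    max(coverage, ceil(quorum * k)) of the first k paths.
    With `weight`, items are summed by weight (e.g. node length in bp).
    """
    counts = Counter()
    curve = []
    for k, items in enumerate(paths, start=1):
        counts.update(set(items))  # count each item once per path
        need = max(coverage, ceil(quorum * k))
        kept = [x for x, c in counts.items() if c >= need]
        curve.append(sum(weight[x] for x in kept) if weight else len(kept))
    return curve


# Made-up data: three haplotypes as node walks, plus node lengths in bp.
walks = [[1, 2, 4], [1, 3, 4], [1, 4]]
node_len = {1: 100, 2: 1, 3: 1, 4: 100}

# Edges are consecutive node pairs of each walk (orientation ignored).
edge_paths = [list(zip(w, w[1:])) for w in walks]

core_nodes = growth(walks, coverage=1, quorum=1.0)
core_bp = growth(walks, coverage=1, quorum=1.0, weight=node_len)
core_edges = growth(edge_paths, coverage=1, quorum=1.0)

print("core nodes:", core_nodes)  # [3, 2, 2]
print("core bp:   ", core_bp)     # [201, 200, 200]
print("core edges:", core_edges)  # [2, 0, 0]
```

In this toy example the two long flanking nodes stay in the core, so the bp curve barely moves, while a single small variant between them is enough to give every haplotype a different edge, so the core edge count drops to 0.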