Using PanGenie for non-directed acyclic graphs and paired-end reads

eblerjana / pangenie

Pangenome-based genome inference

MIT License


In principle, PanGenie can genotype anything that can be represented as a VCF with multi-allelic, non-overlapping records (see https://github.com/eblerjana/pangenie#required-input-files for details on the input files). Inversions and translocations would probably show up as large bubble regions when expressed in VCF, which might be tricky to genotype. Inversions in particular are hard to genotype based on kmers, because the kmer set does not change for the sequence inside an inversion. We have never evaluated the performance specifically for these events, but I expect it to be worse than for other SVs.
PanGenie works in kmer space only, so it cannot distinguish between a kmer and its reverse complement. It therefore counts canonical kmers (Jellyfish is run with the -C switch), which means a kmer and its reverse complement are treated as equivalent (see https://github.com/gmarcais/Jellyfish/tree/master/doc#counting-k-mers-in-sequencing-reads for details).
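To make the point about canonical kmers and inversions concrete, here is a minimal Python sketch. It is not PanGenie's or Jellyfish's code: the helper names, the toy sequences and k = 5 are all made up for illustration, and "canonical" is taken to mean the lexicographically smaller of a kmer and its reverse complement.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def canonical_kmers(seq: str, k: int) -> set:
    """Collect the canonical form of every kmer in seq.

    Assumption for this sketch: the canonical form is the
    lexicographically smaller of the kmer and its reverse complement.
    """
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


# Toy example (hypothetical sequences): a segment flanked by two anchors.
k = 5
left, middle, right = "ACGTTGCA", "GATTACAGGCT", "TTCGGAAC"

reference = left + middle + right
inverted = left + reverse_complement(middle) + right

# Inside the inverted segment the canonical kmer set is unchanged.
same_inside = canonical_kmers(middle, k) == canonical_kmers(
    reverse_complement(middle), k
)
print("same kmers inside the inversion:", same_inside)

# Any difference between the two haplotypes can only come from
# kmers that span one of the two breakpoints.
diff = canonical_kmers(reference, k) ^ canonical_kmers(inverted, k)
print("kmers that differ between the haplotypes:", len(diff))
```

So in this toy setting the only signal available for telling the two orientations apart comes from the few kmers overlapping the breakpoints, which fits the expectation that inversions are harder to genotype than other SVs.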
