pangenome / pggb

the pangenome graph builder
https://doi.org/10.1038/s41592-024-02430-3
MIT License

Mapping k-mers and contigs to pangenome graph #203

Open soungalo opened 2 years ago

soungalo commented 2 years ago

I created a pangenome graph using pggb, resulting in gfa and og files. I now wish to perform some mappings to the graph and have a few questions:

  1. From the pggb documentation I understand that the recommended way to perform mapping is using vg map. Does that mean I have to convert the output to .vg format? How would you recommend doing that?
  2. There are two types of sequence mapping I want to perform. Can you recommend how to do each of these (tools, parameters, etc.): 1) look for perfect matches of 31-mers, and 2) map contigs 500-10k bases long, possibly with many SNPs and indels.

Thanks!

AndreaGuarracino commented 2 years ago

Hi @soungalo,

1) Yes, in order to use vg map, you have to go through vg's formats. You can obtain a vg-format graph with vg convert and then build the indexes needed for mapping with vg index. Something like:

vg index -L -x input.xg input.vg
vg index -g input.gcsa -k 16 input.vg
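
For reference, the whole path from the pggb GFA to a vg map alignment could look roughly like the sketch below. It is not a tested recipe: the file names (graph.gfa, reads.fq, the "input" prefix) and the helper functions run and map_with_vg_map are placeholders made up for illustration, and the vg convert flags (-g for GFA input, -p for PackedGraph output) are assumed from recent vg releases, so check vg convert --help for your version. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/usr/bin/env bash
# Sketch only: GFA -> vg graph -> xg/GCSA indexes -> vg map.
# All file names and helper names here are hypothetical placeholders.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = execute them

# run OUT CMD...: run CMD, redirecting stdout to OUT ("-" means no redirect).
run() {
    local out="$1"; shift
    if [ "$DRY_RUN" = "1" ]; then
        if [ "$out" = "-" ]; then echo "$*"; else echo "$* > $out"; fi
    elif [ "$out" = "-" ]; then
        "$@"
    else
        "$@" > "$out"
    fi
}

# map_with_vg_map GFA READS [PREFIX]
map_with_vg_map() {
    local gfa="$1" reads="$2" prefix="${3:-input}"
    # Convert the pggb GFA into a vg-readable graph (flags assumed, see above).
    run "$prefix.vg" vg convert -g "$gfa" -p &&
    # Build the two indexes from the answer above.
    run - vg index -L -x "$prefix.xg" "$prefix.vg" &&
    run - vg index -g "$prefix.gcsa" -k 16 "$prefix.vg" &&
    # Map FASTQ reads; the alignments are written in GAM format.
    run "$prefix.gam" vg map -x "$prefix.xg" -g "$prefix.gcsa" -f "$reads"
}
```

After sourcing the script, map_with_vg_map graph.gfa reads.fq prints the four commands in order, which makes it easy to review them before running anything on a large graph.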

2) Well, there are several tools for mapping sequences against graphs. Just two examples: for short reads you can use vg giraffe; for long reads / contigs, there is GraphAligner (and very likely vg giraffe too in the near future). Assuming you have done some deeper research in the meantime, which solutions have you found and applied?
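
As a rough illustration of those two options, the sketch below prints example invocations. The file names and the helper functions (run, map_short_reads, map_contigs) are hypothetical. The assumptions are that vg autoindex with the giraffe workflow writes a PREFIX.giraffe.gbz file and that vg giraffe can locate or build its remaining indexes from it; depending on the vg version you may need to pass the minimizer and distance indexes explicitly, so check vg giraffe --help. For GraphAligner, -x vg selects its preset for variation graphs. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch only: short reads with vg giraffe, contigs with GraphAligner.
# All file names and helper names here are hypothetical placeholders.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = execute them

# run OUT CMD...: run CMD, redirecting stdout to OUT ("-" means no redirect).
run() {
    local out="$1"; shift
    if [ "$DRY_RUN" = "1" ]; then
        if [ "$out" = "-" ]; then echo "$*"; else echo "$* > $out"; fi
    elif [ "$out" = "-" ]; then
        "$@"
    else
        "$@" > "$out"
    fi
}

# map_short_reads GFA READS [PREFIX]: build giraffe indexes, then map.
map_short_reads() {
    local gfa="$1" reads="$2" prefix="${3:-idx}"
    run - vg autoindex --workflow giraffe -g "$gfa" -p "$prefix" &&
    run "$prefix.gam" vg giraffe -Z "$prefix.giraffe.gbz" -f "$reads"
}

# map_contigs GFA CONTIGS [OUT]: align long sequences directly to the GFA.
map_contigs() {
    local gfa="$1" contigs="$2" out="${3:-contigs.gaf}"
    run - GraphAligner -g "$gfa" -f "$contigs" -a "$out" -x vg
}
```

For example, map_contigs graph.gfa contigs.fa prints the GraphAligner command that would write alignments to contigs.gaf. Neither command is a substitute for an exact 31-mer lookup, which the answer above does not cover.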