pangenome / odgi

Optimized Dynamic Genome/Graph Implementation: understanding pangenome graphs
https://doi.org/10.1093/bioinformatics/btac308
MIT License

'odgi sort -O' may change node IDs? #580

Closed zhangyixing3 closed 3 months ago

zhangyixing3 commented 3 months ago

Dear sir, I constructed a pangenome graph with vg autoindex (linear reference + VCF). I want to extract the subgraph for one gene locus, but I encountered an error.

$ odgi extract -i  merge.combined.giraffe.og -o path111.og -b path -c 0 -E --threads 2 -P -d 0
[odgi::extract] error: the node IDs are not compacted. Please run 'odgi sort' using -O, --optimize to optimize the graph
$ cat path
Chr01   10005432    10009242

Then I ran odgi sort -i merge.combined.giraffe.og -O -o merge.combined.giraffe.sort.og -t 70, and after that I could extract the subgraph successfully. However, odgi sort changed some node IDs. For example, node 23523825 is changed after odgi sort:

$ odgi position -i merge.combined.giraffe.og -g 23523825 -r Chr01
#target.graph.pos   target.path.pos dist.to.ref strand.vs.ref
23523825,0,+    Chr01,117959405,+   0   +
$ odgi position -i merge.combined.giraffe.sort.og -g 23523825 -r Chr01
#target.graph.pos   target.path.pos dist.to.ref strand.vs.ref

How can I extract a subgraph without changing the Node IDs, for example, from 10,005,432 bp to 10,009,242 bp?
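For anyone who needs to translate old node IDs into the renumbered ones, here is a minimal sketch. It assumes that both graphs have been exported to GFA v1 text, that sorting only renumbers nodes without re-segmenting them, and that the path therefore visits the same number of steps in both graphs. The function names path_steps and map_node_ids are made up for this example. Only nodes visited by the chosen path are mapped.

```python
def path_steps(gfa_text, path_name):
    """Return the node IDs visited by one GFA v1 P line, in path order."""
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 3 and fields[0] == "P" and fields[1] == path_name:
            # Each step looks like "23523825+"; drop the trailing orientation.
            return [step[:-1] for step in fields[2].split(",") if step]
    raise KeyError(f"path {path_name!r} not found")


def map_node_ids(old_gfa, new_gfa, path_name):
    """Pair up old and new node IDs step by step along a shared path.

    Assumption: both graphs contain the same path with the same number
    of steps, which holds only if node boundaries were not changed.
    """
    old_steps = path_steps(old_gfa, path_name)
    new_steps = path_steps(new_gfa, path_name)
    if len(old_steps) != len(new_steps):
        raise ValueError("path has a different number of steps in each graph")
    mapping = {}
    for old_id, new_id in zip(old_steps, new_steps):
        # A node visited twice must map to the same new ID both times.
        if mapping.setdefault(old_id, new_id) != new_id:
            raise ValueError(f"inconsistent mapping for node {old_id}")
    return mapping


# Toy graphs (not real data): same path, different numbering.
old_demo = "H\tVN:Z:1.0\nS\t5\tAC\nS\t9\tG\nS\t12\tTT\nP\tChr01\t5+,9+,12+\t*\n"
new_demo = "H\tVN:Z:1.0\nS\t1\tAC\nS\t2\tG\nS\t3\tTT\nP\tChr01\t1+,2+,3+\t*\n"
demo_mapping = map_node_ids(old_demo, new_demo, "Chr01")
print(demo_mapping)
```

On a graph of this size the whole GFA will not fit comfortably in a Python string, so in practice you would stream the file and keep only the P line of interest.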

zhangyixing3 commented 3 months ago

Now I have extracted the subgraph successfully by running vg find -x merge.combined.giraffe.vg -N nodes.list -c 0 > path_vg.find.vg
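In case it helps others, one way to build a node list like nodes.list for a coordinate range is sketched below. It is only an illustration and assumes the graph is available as GFA v1 text with sequences on the S lines, that the range is 0-based and half-open like a BED interval, and that one node ID per line is an acceptable list format. The function name nodes_in_range is hypothetical.

```python
def nodes_in_range(gfa_text, path_name, start, end):
    """Return node IDs on a path whose bases overlap [start, end).

    Coordinates are 0-based, half-open offsets along the path.
    Assumption: every S line carries its sequence (not "*").
    """
    seg_len = {}
    steps = None
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        if fields[0] == "S" and len(fields) >= 3:
            seg_len[fields[1]] = len(fields[2])
        elif fields[0] == "P" and len(fields) >= 3 and fields[1] == path_name:
            # Strip the trailing "+" or "-" orientation from each step.
            steps = [s[:-1] for s in fields[2].split(",") if s]
    if steps is None:
        raise KeyError(f"path {path_name!r} not found")

    hits, seen, offset = [], set(), 0
    for node in steps:
        node_end = offset + seg_len[node]
        # Keep the node if its interval on the path overlaps the query.
        if offset < end and node_end > start and node not in seen:
            hits.append(node)
            seen.add(node)
        offset = node_end
    return hits


# Toy graph (not real data): nodes cover path offsets 0-2, 2-3 and 3-5.
demo_gfa = "H\tVN:Z:1.0\nS\t1\tAC\nS\t2\tG\nS\t3\tTT\nP\tChr01\t1+,2+,3+\t*\n"
demo_nodes = nodes_in_range(demo_gfa, "Chr01", 2, 4)
print("\n".join(demo_nodes))
```

This collects only the nodes on the reference path itself. Alternative alleles in the region are not on that path, so they would have to come from context expansion or from walking the edges between the first and last node.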