pangenome / odgi

Optimized Dynamic Genome/Graph Implementation: understanding pangenome graphs
https://doi.org/10.1093/bioinformatics/btac308
MIT License

Exporting all possible closed circular paths in FASTA from a GFA file #563

Open splaisan opened 5 months ago

splaisan commented 5 months ago

Hi

I am new to graph analysis and was sent here by Erik Garrison (https://github.com/GFA-spec/GFA-spec/issues/123).

Could someone tell me which odgi command(s) I should use to generate all possible closed circular paths, in FASTA format, from a GFA graph produced by Flye?

I can create the four models from this GFA manually, but I would like to automate the process for future pipelines.

Thanks in advance.

Here is the truncated GFA content (each line cut at 40 characters):

H   VN:Z:1.0
S   1   CTATCTTGGTTCCACAAATCTCATTACACCAATATT
S   2   AAGTGGGAGTGGATTAACAGAAATGGCCCCGTACGG
S   3   AAGTGGGGGTGGAGTAAGAGAAATTGCCCCGTACTG
S   4   TAATTGAGTTCCGTGTTCCGGGCAGCACCACCACTG
S   5   GCACTCGAACGACGAAGTAAAGAACGCGAAAAAGCG
S   6   CACTCTGACAATTCGTTGATCAAGTCACGGTATTTA
L   1   +   6   -   0M  RC:i:37
L   1   -   5   +   0M  RC:i:43
L   2   +   6   +   0M  RC:i:96
L   2   -   5   -   0M  RC:i:91
L   3   +   6   +   0M  RC:i:60
L   3   -   5   -   0M  RC:i:58
L   4   -   5   +   0M  RC:i:108
L   4   +   6   -   0M  RC:i:115
P   contig_5    5+  *
P   contig_6    6+  *
P   contig_1    1+  *
P   contig_2    2+  *
P   contig_3    3+  *
P   contig_4    4+  *


sivico26 commented 5 months ago

@splaisan, have you tried the get_organelle_from_assembly.py script from GetOrganelle? I think it does what you want.
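
If a standalone script is preferred, the enumeration can also be sketched in plain Python. This is only a rough sketch, not an odgi or GetOrganelle feature, and it rests on several assumptions: all link overlaps are `0M` (as in the GFA above), each oriented segment is visited at most once per cycle (so models that traverse a repeat twice are not produced), and a cycle, its rotations, and its reverse complement count as the same path. The names `parse_gfa`, `circular_paths`, `cycle_to_fasta`, and the `circular_N` FASTA headers are hypothetical. On the truncated graph above it finds four cycles; whether these match the four manual models is not something the sketch can confirm.

```python
from typing import Dict, List, Set, Tuple

Step = Tuple[str, str]  # (segment name, orientation "+" or "-")
FLIP = {"+": "-", "-": "+"}
COMP = str.maketrans("ACGTNacgtn", "TGCANtgcan")

# Truncated GFA from the post (fields may be separated by tabs or spaces).
GFA_TEXT = """\
S 1 CTATCTTGGTTCCACAAATCTCATTACACCAATATT
S 2 AAGTGGGAGTGGATTAACAGAAATGGCCCCGTACGG
S 3 AAGTGGGGGTGGAGTAAGAGAAATTGCCCCGTACTG
S 4 TAATTGAGTTCCGTGTTCCGGGCAGCACCACCACTG
S 5 GCACTCGAACGACGAAGTAAAGAACGCGAAAAAGCG
S 6 CACTCTGACAATTCGTTGATCAAGTCACGGTATTTA
L 1 + 6 - 0M
L 1 - 5 + 0M
L 2 + 6 + 0M
L 2 - 5 - 0M
L 3 + 6 + 0M
L 3 - 5 - 0M
L 4 - 5 + 0M
L 4 + 6 - 0M
"""


def revcomp(seq: str) -> str:
    return seq.translate(COMP)[::-1]


def parse_gfa(text: str) -> Tuple[Dict[str, str], Dict[Step, Set[Step]]]:
    """Read S and L lines; every link also implies its reverse-complement link."""
    seqs: Dict[str, str] = {}
    edges: Dict[Step, Set[Step]] = {}
    for line in text.splitlines():
        f = line.split()
        if not f:
            continue
        if f[0] == "S":
            seqs[f[1]] = f[2]
        elif f[0] == "L":
            a, b = (f[1], f[2]), (f[3], f[4])
            edges.setdefault(a, set()).add(b)
            edges.setdefault((b[0], FLIP[b[1]]), set()).add((a[0], FLIP[a[1]]))
    return seqs, edges


def canonical(cycle: List[Step]) -> Tuple[Step, ...]:
    """Smallest rotation of the cycle or of its reverse complement."""
    rc = [(name, FLIP[o]) for name, o in reversed(cycle)]
    return min(tuple(c[i:] + c[:i]) for c in (cycle, rc) for i in range(len(c)))


def circular_paths(edges: Dict[Step, Set[Step]]) -> List[Tuple[Step, ...]]:
    """Depth-first search for simple cycles over oriented segments."""
    found: Set[Tuple[Step, ...]] = set()

    def dfs(start: Step, path: List[Step], seen: Set[Step]) -> None:
        for nxt in sorted(edges.get(path[-1], ())):
            if nxt == start:
                found.add(canonical(path))
            elif nxt not in seen:
                dfs(start, path + [nxt], seen | {nxt})

    for start in sorted(edges):
        dfs(start, [start], {start})
    return sorted(found)


def cycle_to_fasta(name: str, cycle: Tuple[Step, ...], seqs: Dict[str, str]) -> str:
    """Concatenate oriented segments; valid only because overlaps are 0M."""
    seq = "".join(seqs[n] if o == "+" else revcomp(seqs[n]) for n, o in cycle)
    return f">{name} {','.join(n + o for n, o in cycle)}\n{seq}\n"


if __name__ == "__main__":
    seqs, edges = parse_gfa(GFA_TEXT)
    for i, cycle in enumerate(circular_paths(edges), start=1):
        print(cycle_to_fasta(f"circular_{i}", cycle, seqs), end="")
```

The search is exhaustive, so it is only practical for small assembly graphs like this one. Overlaps other than `0M` would need trimming before concatenation.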