ekg / seqwish

alignment to variation graph inducer
MIT License

how to avoid having a cyclic graph #95

Closed khodor14 closed 2 years ago

khodor14 commented 2 years ago

I'm using seqwish to construct a graph.

I used the command seqwish -s genomes.fa -p alignment.paf.gz -g graph.gfa

For genomes.fa, I used 11 E. coli genomes in the first test. For the second test, I took a reference and introduced SNPs at different positions to get 10 mutated copies of the reference.

For the alignment, I'm using minimap2.

My issue is that in both cases the graph turned out to be cyclic.

I tried the option -r 1 but I still get a cyclic graph.

Is there a way to avoid having cycles in the graph?

ekg commented 2 years ago

Avoiding all cycles will require that you build the graph with partial order alignment. Do you want to avoid all cycles, or are you trying to reduce their number? It's worth noting that bacterial genomes are circular, so in a general sense it won't be possible to eliminate all cycles.


khodor14 commented 2 years ago

Thank you Erik for your quick reply.

"Do you want to avoid all cycles?" I'm trying to align sequences using GraphChainer, which does not accept cyclic graphs for the moment. This is why I'm trying to avoid cycles.

However, for the test where I introduced SNPs into the genome, I believe it doesn't make sense to get cycles (I'm not sure, but I will verify with acyclic genomes).
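
To verify whether a GFA written by seqwish contains cycles, something like the following Python sketch could be used. It is only an illustration: the function name gfa_has_cycle is hypothetical (it is not part of seqwish or GraphChainer), it assumes tab-separated GFA 1 with S and L lines, and it treats each segment as two oriented nodes (+ and -), which may not match exactly how GraphChainer decides that a graph is cyclic.

```python
from collections import defaultdict, deque

FLIP = {"+": "-", "-": "+"}


def gfa_has_cycle(gfa_lines):
    """Return True if the GFA links form a directed cycle over oriented segments.

    Hypothetical helper: assumes tab-separated GFA 1 records (S and L lines).
    """
    succ = defaultdict(set)  # oriented node -> set of successor oriented nodes
    nodes = set()
    for line in gfa_lines:
        f = line.rstrip("\n").split("\t")
        if f[0] == "S":
            nodes.update({(f[1], "+"), (f[1], "-")})
        elif f[0] == "L":
            a, b = (f[1], f[2]), (f[3], f[4])
            a_rc, b_rc = (a[0], FLIP[a[1]]), (b[0], FLIP[b[1]])
            nodes.update({a, b, a_rc, b_rc})
            succ[a].add(b)
            # Each link also implies the reverse-complement traversal.
            succ[b_rc].add(a_rc)

    # Kahn's algorithm: if some nodes can never be removed, a cycle exists.
    indeg = {n: 0 for n in nodes}
    for outs in succ.values():
        for m in outs:
            indeg[m] += 1
    queue = deque(n for n, d in indeg.items() if d == 0)
    removed = 0
    while queue:
        n = queue.popleft()
        removed += 1
        for m in succ.get(n, ()):
            indeg[m] -= 1
            if indeg[m] == 0:
                queue.append(m)
    return removed < len(nodes)
```

It can be called on an open file, for example gfa_has_cycle(open("graph.gfa")), where graph.gfa is the output of the seqwish command above.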

ekg commented 2 years ago

Please feel free to ping me by email. I'd be happy to discuss this further.

Another option is to use a mapper/aligner like wfmash. This is designed to find maximum length homologies, and in general is easier to configure for the graph building problem.
