reveal bubbles - variant wrongly (not) traversed

jasperlinthorst / reveal

Graph based multi genome aligner

MIT License

reveal bubbles - variant wrongly (not) traversed #12

Closed by ChriKub 7 years ago

ChriKub commented 7 years ago

Hi, I ran into a problem using reveal bubbles. I constructed a very simple test case: two input sequences that share two segments (Seq1 and Seq2), with a SNP between the segments in one of the inputs.

Seq1 - SNP - Seq2
Seq1 - Seq2

The reveal alignment resolves this SNP as expected

P Testset_SNP.fasta 1+,2+,3+ 0M.0M.0M
P Testset_noSNP.fasta 1+,3+ 0M.0M

But when I run reveal bubbles on the gfa file (no matter whether I set a reference, or which of the two I pick), the bubble is detected but the variant is not traversed by either path.

source  sink  subgraph  ref              pos   variant  Testset_SNP.fasta  Testset_noSNP.fasta
1       3     1,2,3     TestSet1SNP.gfa  2984  A,-      -                  -

Thanks, Chris

jasperlinthorst commented 7 years ago

Hi Chris, can you send me the actual files? It's not clear to me what you're trying to do. It seems to me that you're aligning two sequences, one of which simply has an A inserted (so an indel, not a SNP), that you aligned those into a graph called 'TestSet1SNP.gfa', and that 'reveal bubbles' calls the indel...

I do see that the actual sample information about the call is empty (the last two columns), which shouldn't be the case (unless you use the --nometa parameter?). Also, about your bonus question: in my output this '.' is a ',' (comma).

Cheers, Jasper

ChriKub commented 7 years ago

Hi Jasper, not using --nometa fixed it for me, although it is not clear to me why this should influence the bubble calls, as the bubble is clearly defined by the paths. Concerning the dot: it ended up there during my own post-processing after the bubble calling, so it is not reveal related.

Thanks, Chris

jasperlinthorst commented 7 years ago

I use the meta data (the TAGs after each node) that I define on the nodes to make the call. I only output the paths to comply with the GFA standard and other tools that depend on this information, but I don't use or parse them when loading the graph (yet). It should be a small thing to add in case you need it; I'm just a bit busy at the moment.
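For readers who only have the path lines left, the per-sample route through a bubble can in principle be recovered from the P lines themselves. The following is a minimal sketch of that idea, not reveal's own code: it assumes tab-separated GFA1 path lines (`P`, path name, comma-separated oriented segments, overlaps), and the helper names `read_paths` and `bubble_traversals` are hypothetical.

```python
def read_paths(gfa_lines):
    """Map path name -> list of node ids (orientation stripped) from GFA1 P lines."""
    paths = {}
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] != "P" or len(fields) < 3:
            continue  # not a path line
        paths[fields[1]] = [seg.rstrip("+-") for seg in fields[2].split(",")]
    return paths


def bubble_traversals(paths, source, sink):
    """For each path, list the nodes visited strictly between source and sink.

    A path that does not pass through source and then sink maps to None.
    """
    result = {}
    for name, nodes in paths.items():
        if source in nodes and sink in nodes:
            i, j = nodes.index(source), nodes.index(sink)
            result[name] = nodes[i + 1:j] if i < j else None
        else:
            result[name] = None
    return result


# The two path lines from this thread (overlaps written with commas).
gfa = [
    "P\tTestset_SNP.fasta\t1+,2+,3+\t0M,0M,0M",
    "P\tTestset_noSNP.fasta\t1+,3+\t0M,0M",
]
traversals = bubble_traversals(read_paths(gfa), "1", "3")
print(traversals)
```

For the bubble with source 1 and sink 3, the first path goes through node 2 and the second skips it, which is the sample information that was missing from the last two columns.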

ChriKub commented 7 years ago

Hi, I see. I remove the meta data during further processing as it is not recognized by one of my downstream tools. Now that I know that the meta data is important for the bubble caller I will just keep them until I've called my bubbles and discard them afterwards. Thanks
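The "call bubbles first, strip tags afterwards" workflow could look roughly like the sketch below. It assumes GFA1 segment lines where the record type, name, and sequence are the first three tab-separated fields and everything after them is an optional tag; the function name `strip_segment_tags` and the tag `zz:Z:example` are made up for illustration and are not reveal's actual tags.

```python
def strip_segment_tags(gfa_lines):
    """Yield GFA1 lines with optional tags removed from S (segment) lines only."""
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S":
            fields = fields[:3]  # keep record type, name, sequence
        yield "\t".join(fields)


# Hypothetical input: one tagged segment, one path line left untouched.
example = [
    "S\t2\tA\tzz:Z:example\n",
    "P\tTestset_SNP.fasta\t1+,2+,3+\t0M,0M,0M\n",
]
stripped = list(strip_segment_tags(example))
print("\n".join(stripped))
```

Running this only on a copy of the graph, after the bubble calls have been written out, keeps the tagged original available in case the bubbles need to be called again.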