ruanjue / wtdbg2

Redbean: A fuzzy Bruijn graph approach to long noisy reads assembly
GNU General Public License v3.0

Positions and consensus sequences of alternative SV alleles #165

Closed JYLeeBioinfo closed 4 years ago

JYLeeBioinfo commented 4 years ago

Hi

How can I extract the consensus sequences and positions of the alternative paths that result from heterozygous structural variations? By positions, I mean coordinates relative to the final contigs (.ctg.fa).

Following your answers to a previous issue (#64), I looked at the .frg.nodes file and at <prefix>.frg.dot.gz, but I am not familiar with these file formats and am not sure how to extract that information from them.

The information I want to extract is something like this.

#ctg_id  start  end   alt_id           consensus_sequence
ctg1     1000   1001  ctg1_alt1_ins    AAATTTGGG
ctg1     1500   2000  ctg1_alt2_del    .
ctg2     1000   1100  ctg2_alt1_indel  ATTGGTTAAGGATAG

Could you help me with this?

ruanjue commented 4 years ago

prefix.*.dot is written in the Graphviz DOT language; please check its documentation. As for SV detection, I am afraid I will have little time to write the code in the near future or next year.
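
As a starting point for reading such a file, here is a minimal Python sketch that lists nodes with more than one outgoing edge, which is one rough way to spot where the graph branches into alternative paths. It assumes the file uses plain DOT edge statements of the form `A -> B [attributes];`, one per line; the exact node names and attributes that wtdbg2 writes are not described in this thread, so they are not interpreted here. The function names (`read_edges`, `branch_nodes`, `branch_nodes_from_file`) are hypothetical helpers, not part of wtdbg2, and the sketch does not map branches to contig coordinates or produce consensus sequences.

```python
import gzip
import re
from collections import defaultdict

# Matches a simple DOT edge statement such as:  N1 -> N2 [label="..."];
# Node names may be bare identifiers or double-quoted strings.
# Assumption: one edge per line; chained edges (A -> B -> C) are not handled.
EDGE_RE = re.compile(r'^\s*("[^"]+"|[^\s\[\];]+)\s*->\s*("[^"]+"|[^\s\[\];]+)')


def read_edges(lines):
    """Yield (source, target) pairs from an iterable of DOT lines."""
    for line in lines:
        m = EDGE_RE.match(line)
        if m:
            yield m.group(1).strip('"'), m.group(2).strip('"')


def branch_nodes(edges):
    """Return {node: sorted successors} for nodes with more than one outgoing edge."""
    succ = defaultdict(set)
    for src, dst in edges:
        succ[src].add(dst)
    return {node: sorted(targets) for node, targets in succ.items() if len(targets) > 1}


def branch_nodes_from_file(path):
    """Open a plain or gzip-compressed DOT file and report its branch nodes."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as fh:
        return branch_nodes(read_edges(fh))
```

Calling `branch_nodes_from_file("<prefix>.frg.dot.gz")` with the real prefix substituted would return a dictionary of branching nodes and their successors. Whether a given branch corresponds to a heterozygous SV still has to be checked by hand or with further analysis.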