0xTCG / biser

A fast tool for detecting and decomposing segmental duplications in genome assemblies
MIT License

How to get this information from the output of BISER? #40

Open Shuixin-Li opened 8 months ago

Shuixin-Li commented 8 months ago

uint indelN; "number of indels"
uint indelS; "indel spaces"
uint alignB; "bases Aligned"
uint matchB; "aligned bases that match"
uint mismatchB; "aligned bases that do not match"
uint transitionsB; "number of transitions"
uint transversionsB; "number of transversions"
float fracMatch; "fraction of matching bases"
float fracMatchIndel; "fraction of matching bases with indels"

Shuixin-Li commented 8 months ago

Thank you for your wonderful tool!

But I am confused about how to get the fields above from the BISER results. I don't think they are in the result table. If I need to parse the CIGAR strings, how can I do that (which tool should I use)?

Thank you very much!!

inumanag commented 8 months ago

Hi @Shuixin-Li

Sorry, I do not understand the question. All information is in the output file. There are many libraries for CIGAR parsing, such as htslib, pysam, or cigar. You will need to write a script to extract and walk the BISER CIGARs in the manner you want, though.
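As a rough illustration of what "walking" a CIGAR could look like, here is a minimal Python sketch using only the standard library. It is not part of BISER. The function name `cigar_stats` is hypothetical, and it assumes SAM-style operators, where `I` consumes the second (query) sequence and `D` consumes the first (reference) sequence. Check which mate is which in your BISER output before relying on it. Transitions and transversions cannot be recovered from the CIGAR alone, so the sketch only fills them in when both aligned sequences are passed in. The `fracMatchIndel` formula, which counts each indel event as one extra aligned column, is also an assumption about how that field is defined.

```python
import re

# SAM-style CIGAR operators; assumed to cover what appears in the output.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def cigar_stats(cigar, ref=None, qry=None):
    """Tally alignment statistics from a CIGAR string (hypothetical helper).

    ref/qry are the two aligned sequences (optional). Without them,
    matches and mismatches are only known for '=' and 'X' operators,
    and transitions/transversions stay at zero.
    """
    s = dict(indelN=0, indelS=0, alignB=0, matchB=0, mismatchB=0,
             transitionsB=0, transversionsB=0)
    have_seqs = ref is not None and qry is not None
    i = j = 0  # current positions in ref and qry
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "ID":
            s["indelN"] += 1   # one indel event
            s["indelS"] += n   # bases spanned by that event
            if op == "I":
                j += n         # assumption: insertion consumes qry
            else:
                i += n         # assumption: deletion consumes ref
        elif op in "M=X":
            s["alignB"] += n
            if have_seqs:
                for a, b in zip(ref[i:i + n].upper(), qry[j:j + n].upper()):
                    if a == b:
                        s["matchB"] += 1
                        continue
                    s["mismatchB"] += 1
                    pair = {a, b}
                    if pair <= PURINES or pair <= PYRIMIDINES:
                        s["transitionsB"] += 1    # A<->G or C<->T
                    elif pair <= PURINES | PYRIMIDINES:
                        s["transversionsB"] += 1  # purine<->pyrimidine
                    # pairs involving N or other symbols are left unclassified
            elif op == "=":
                s["matchB"] += n
            elif op == "X":
                s["mismatchB"] += n
            # a plain 'M' without sequences cannot be split into match/mismatch
            i += n
            j += n
        elif op == "S":
            j += n  # soft clip consumes qry only
        elif op == "N":
            i += n  # skipped region consumes ref only
    if s["alignB"]:
        s["fracMatch"] = s["matchB"] / s["alignB"]
        # Assumed definition: each indel event adds one column to the denominator.
        s["fracMatchIndel"] = s["matchB"] / (s["alignB"] + s["indelN"])
    return s


if __name__ == "__main__":
    print(cigar_stats("4M1I3M", ref="ACGTACG", qry="ACGCTAAG"))
```

If the CIGARs only contain `M` (no `=`/`X`), you will need to extract the two aligned regions from the FASTA yourself (for example with pysam or samtools faidx) and pass them in as `ref` and `qry`.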

Shuixin-Li commented 8 months ago

Thank you for your reply. For example, I really want to know the number of transitions, but I cannot find it in the BISER output. I guess the answer may be hidden inside the CIGAR string, but I don't know whether there is any tool that can help me parse it.