OpenGene / gencore

Generate duplex/single consensus reads to reduce sequencing noises and remove duplications
MIT License

gencore doesn't generate a consensus read from two read pairs with reciprocal UMIs, such as ATGC_GCAA and GCAA_ATGC #36

Open litun-fkby opened 3 years ago

litun-fkby commented 3 years ago

I use gencore to process a BAM file that carries UMI information, but at some positions the output is inconsistent with the description. Read pair 1 has the UMI TGT_CGA and maps to chr17:7578090-7578182: image

Read pair 2 has the UMI CGA_TGT and maps to the same position, chr17:7578090-7578182: image. However, read pairs 1 and 2 do not generate a single consensus read pair. Read pairs 3 and 4 show the same situation, as follows: image image. How can I generate one consensus read pair?
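
To make the expected behaviour concrete, here is a minimal Python sketch of how reciprocal dual UMIs could be treated as one duplex family. This is not gencore's code. The helper names `canonical_umi` and `group_duplex`, the `_` separator, and the tuple layout are hypothetical and used only for illustration. It assumes that read pairs at the same coordinates whose UMI halves are swapped come from the two strands of the same molecule.

```python
def canonical_umi(umi: str, sep: str = "_") -> str:
    """Return an orientation-independent key for a dual UMI such as 'TGT_CGA'."""
    parts = umi.split(sep)
    if len(parts) != 2:
        return umi  # single UMI: nothing to canonicalize
    # Sorting the two halves makes 'TGT_CGA' and 'CGA_TGT' map to the same key.
    return sep.join(sorted(parts))


def group_duplex(reads):
    """Group (name, chrom, start, end, umi) tuples by position and canonical UMI."""
    groups = {}
    for name, chrom, start, end, umi in reads:
        key = (chrom, start, end, canonical_umi(umi))
        groups.setdefault(key, []).append(name)
    return groups


# The two read pairs described above (hypothetical names pair1 / pair2).
reads = [
    ("pair1", "chr17", 7578090, 7578182, "TGT_CGA"),
    ("pair2", "chr17", 7578090, 7578182, "CGA_TGT"),
]
groups = group_duplex(reads)
for key, names in groups.items():
    print(key, names)
```

Under that assumption both pairs fall into a single group, which is the family I would expect one consensus read pair to be built from.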

mmokrejs commented 8 months ago

I would also appreciate it if the R1 and R2 members of a read pair could simply be merged into a single read, with Ns and other discrepancies resolved (preferring the base with the higher QUAL, though this could be configurable). I only want to merge them based on the SAM alignment information, not by any k-mer-based overlapping approach, for which there are many other tools. I want an alignment-based solution for finding the overlapping region.
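
A rough Python sketch of what I mean, not a proposal for the actual implementation. The function names `aligned_bases` and `merge_mates` are hypothetical, and the sketch assumes ungapped alignments (every base maps to consecutive reference positions). Real BAM records would need CIGAR-aware coordinates for indels and clipping.

```python
def aligned_bases(ref_start, seq, quals):
    """Map reference position -> (base, qual), assuming an ungapped alignment."""
    return {ref_start + i: (b, q) for i, (b, q) in enumerate(zip(seq, quals))}


def merge_mates(r1, r2):
    """Merge two mates by reference coordinate.

    Each mate is (ref_start, seq, quals). Where both mates cover a position,
    a non-N base is preferred, then the base with the higher quality.
    """
    a, b = aligned_bases(*r1), aligned_bases(*r2)
    merged = {}
    for pos in sorted(a.keys() | b.keys()):
        candidates = [x[pos] for x in (a, b) if pos in x]
        merged[pos] = max(candidates, key=lambda bq: (bq[0] != "N", bq[1]))
    positions = sorted(merged)
    seq = "".join(merged[p][0] for p in positions)
    quals = [merged[p][1] for p in positions]
    return positions[0], seq, quals


# Toy mates overlapping at reference positions 103-104 (made-up data).
r1 = (100, "ACGTN", [30, 30, 30, 30, 2])
r2 = (103, "CAGG", [20, 35, 30, 30])
start, seq, quals = merge_mates(r1, r2)
print(start, seq, quals)
```

In this toy case the mismatch at position 103 is resolved in favour of the higher-quality T from R1, and the N at position 104 is replaced by the A from R2. The tie-breaking rule is the part that could be made configurable.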