parklab / xTea

Comprehensive TE insertion identification with WGS/WES data from multiple sequencing technologies

Handling duplicates #80

Closed mikecuoco closed 1 year ago

mikecuoco commented 1 year ago

Hi @simoncchu, thanks for writing this great program. How does xTea handle duplicates? Should users deduplicate their BAMs prior to running xTea?

simoncchu commented 1 year ago

Hi @mikecuoco, xTea skips reads that are marked as duplicates in the "FLAG" field of each alignment, so you need to run duplicate marking on the BAM first. That said, the results may not be affected much if this is for germline calling.
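
For readers unfamiliar with the FLAG field, here is a minimal sketch of what "skipping flagged duplicates" generally means. It is not xTea's actual code. It assumes only the SAM specification's duplicate bit (0x400, decimal 1024); the helper names `is_duplicate` and `non_duplicate_records` and the sample records are hypothetical, made up for illustration.

```python
# Illustrative sketch only, not xTea's implementation.
# The SAM specification defines FLAG bit 0x400 as "PCR or optical duplicate".
DUP_FLAG = 0x400


def is_duplicate(flag: int) -> bool:
    """Return True if the duplicate bit is set in a SAM FLAG value."""
    return bool(flag & DUP_FLAG)


def non_duplicate_records(sam_lines):
    """Yield the tab-split fields of each alignment not flagged as a duplicate."""
    for line in sam_lines:
        if line.startswith("@"):  # header lines carry no FLAG
            continue
        fields = line.rstrip("\n").split("\t")
        if not is_duplicate(int(fields[1])):  # FLAG is the second column
            yield fields


# Hypothetical records: read2 has FLAG 1123 = 99 + 1024, so the duplicate bit is set.
sam = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "read2\t1123\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
]
kept = [fields[0] for fields in non_duplicate_records(sam)]
print(kept)
```

A filter like this only has an effect if the duplicate bit has already been set, which is why duplicates need to be marked beforehand, for example with a tool such as `samtools markdup` or Picard `MarkDuplicates`.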

mikecuoco commented 1 year ago

Great, thanks!