ARUP-NGS / BMFtools

Barcoded Molecular Families
MIT License

After rescue, trim all N'd bases from the ends of reads #79

dnbaker closed this issue 8 years ago

dnbaker commented 8 years ago

Before alignment, cutadapt masks reads rather than trimming them, which keeps the implementation efficient in both rescue and initial demultiplexing. For variable-length barcodes, masking also adds a check that each read pair being considered for collapse has barcodes of identical length.

However, excessive soft-clipping complicates downstream analysis, especially for structural variants, and can lead to false positives.

TODO: add N-trimming as a postprocessing step in bmftools rsq, before writing out to either file or fastq.
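
For illustration only, here is a minimal Python sketch of what trimming N'd bases from read ends could look like on an unaligned sequence/quality pair. It is not the bmftools rsq code; the function name `trim_masked_ends` and its `mask` parameter are hypothetical. It assumes masked bases are written as `N` and that only leading and trailing runs should be removed, leaving internal N's alone.

```python
def trim_masked_ends(seq, qual, mask="N"):
    """Strip leading and trailing masked bases from a read.

    Hypothetical helper for illustration: `seq` is the base string,
    `qual` the matching quality string. Internal masked bases are kept.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    # Number of masked bases at the 5' end.
    start = len(seq) - len(seq.lstrip(mask))
    # Index just past the last unmasked base at the 3' end.
    end = len(seq.rstrip(mask))
    if end <= start:
        # The read is empty or consists only of masked bases.
        return "", ""
    # Slice both strings identically so qualities stay in register.
    return seq[start:end], qual[start:end]


if __name__ == "__main__":
    print(trim_masked_ends("NNACGTNAN", "!!IIIIII!"))
```

For records that are already aligned, trimming would also have to update the CIGAR string and possibly the alignment start position; this sketch leaves that out.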

dnbaker commented 8 years ago

Maskripper provides this functionality.