galaxyproject / dunovo

Reference-free duplex sequencing pipeline.

More intelligently identify barcode sequence, based on 5bp linker after it. #19

Open NickSto opened 6 years ago

NickSto commented 6 years ago

It would be great if make-families.sh could identify where the barcode ends by detecting the constant 5bp sequence that follows it.

It can't rely on an exact match, though, since that would throw out all reads with errors in those 5bp.

This would also help in cases where an error in the protocol resulted in invalid barcodes, since this feature could discard reads that lack an identifiable constant sequence.
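
To make the idea concrete, here is a rough Python sketch of one way this could work. It is not how make-families.sh currently does anything. It scans a small window around the expected barcode end and accepts the position where the read best matches the linker, allowing a limited number of mismatches (Hamming distance). The linker sequence `TGACT`, the 12bp expected barcode length, the function names, and the `max_shift` and `max_mismatches` defaults are all placeholders chosen for illustration, not values taken from the pipeline.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_barcode_end(read, linker, expected_len, max_shift=2, max_mismatches=1):
    """Return the index where the barcode ends, or None if no linker is found.

    Checks every start position within max_shift of expected_len and keeps
    the one with the fewest mismatches to the linker. Ties go to the position
    closest to expected_len. All parameter defaults are illustrative.
    """
    best_pos, best_key = None, None
    start = max(0, expected_len - max_shift)
    for pos in range(start, expected_len + max_shift + 1):
        window = read[pos:pos + len(linker)]
        if len(window) < len(linker):
            break  # ran off the end of the read
        dist = hamming(window, linker)
        if dist > max_mismatches:
            continue  # too many errors to call this the linker
        key = (dist, abs(pos - expected_len))
        if best_key is None or key < best_key:
            best_pos, best_key = pos, key
    return best_pos


LINKER = "TGACT"   # hypothetical placeholder, not the real constant sequence
BARCODE_LEN = 12   # hypothetical expected barcode length

reads = {
    "exact":     "AAACCCGGGAAA" + "TGACT" + "CCCCCCCC",
    "one_error": "AAACCCGGGAAA" + "TGCCT" + "CCCCCCCC",
    "short_bar": "AAACCCGGGAA" + "TGACT" + "CCCCCCCC",
    "no_linker": "AAACCCGGGAAA" + "GGGGG" + "CCCCCCCC",
}

results = {}
for name, read in reads.items():
    end = find_barcode_end(read, LINKER, BARCODE_LEN)
    # Reads without an identifiable linker are discarded (barcode is None).
    results[name] = None if end is None else read[:end]
    print(name, results[name])
```

A read with one substitution in the linker is still kept, a read whose barcode is one base short is trimmed at the right place, and a read with no recognizable linker comes back as `None` so it can be dropped. Hamming distance only tolerates substitutions inside the 5bp itself; the shift window is what absorbs length changes in the barcode. Handling indels within the linker would need an edit-distance comparison instead.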