timoast / sinto

Tools for single-cell data processing
https://timoast.github.io/sinto/
MIT License

Specify a whitelist when adding barcodes #61

Closed: dawe closed this 10 months ago

dawe commented 10 months ago

Hello, following up on issue #60, I've added the option to specify a whitelist to the barcode predicate. This may be useful if one doesn't want to create an extra FASTQ file containing the corrected barcodes, which saves some disk space (especially since using uncompressed FASTQ files is recommended). If no whitelist is specified, nothing changes. Otherwise, one can supply either a whitelist produced by UMI-tools or a generic whitelist. In the first case, it is used directly to correct barcodes whenever possible, without further checks. In the second case, the UMI-tools API is used to correct barcodes on the fly. On-the-fly correction takes extra processing time, which depends largely on the FASTQ size.
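To illustrate the two modes described above, here is a minimal sketch, not the code in this PR. It assumes the UMI-tools whitelist is the usual tab-separated file (canonical barcode, comma-separated barcodes to correct to it, then counts). For the generic whitelist it uses a plain one-mismatch search as a stand-in for the UMI-tools API call. The function names `load_umitools_whitelist` and `correct_on_the_fly` are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set


def load_umitools_whitelist(lines: Iterable[str]) -> Dict[str, str]:
    """Build a lookup from every listed barcode to its canonical barcode.

    Assumed layout per line: canonical<TAB>variant1,variant2<TAB>counts...
    """
    mapping: Dict[str, str] = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if not fields[0]:
            continue  # skip blank lines
        canonical = fields[0]
        mapping[canonical] = canonical
        if len(fields) > 1 and fields[1]:
            for variant in fields[1].split(","):
                mapping[variant] = canonical
    return mapping


def correct_on_the_fly(barcode: str, whitelist: Set[str]) -> Optional[str]:
    """Stand-in for on-the-fly correction against a generic whitelist.

    Returns the barcode itself if whitelisted, the unique whitelisted
    barcode one mismatch away if there is exactly one, otherwise None.
    """
    if barcode in whitelist:
        return barcode
    candidates = set()
    for i, base in enumerate(barcode):
        for alt in "ACGTN":
            if alt == base:
                continue
            candidate = barcode[:i] + alt + barcode[i + 1:]
            if candidate in whitelist:
                candidates.add(candidate)
    # Ambiguous or unmatched barcodes are left uncorrected.
    return candidates.pop() if len(candidates) == 1 else None
```

The first mode is a single dictionary lookup per read, while the second has to search neighbours for every barcode that is not an exact match, which is consistent with the extra processing time mentioned above.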

timoast commented 10 months ago

Thanks @dawe, this looks great!