Open ste-depo opened 1 week ago
Do you mean that the whitelist allows wildcards represented by N?
It seems so. The whitelist was generated by UMI-tools (version 1.1.1) using the command:
umi_tools whitelist --method=reads --extract-method=string --bc-pattern=CCCCCCCCCCCCCCCC -I ${SAMPLE}_R2.fastq.gz -S ${SAMPLE}_whitelist.tsv --plot-prefix=${SAMPLE} --set-cell-number=${n_cells} --subset-reads=10000000000
This is one of the barcodes containing Ns that I found in the whitelist:
NAAAGTAGACTTAGTG NAAAGTACACTTAGTG,NAAAGTAGAATTAGTG,NAAAGTAGACATAGTG,NAAAGTAGACCTAGTG,NAAAGTAGACTCAGTG,NAAAGTAGACTGAGTG,NAAAGTAGACTTACTG,NAAAGTAGACTTAGTC,NAAAGTAGACTTAGTT,NAAAGTAGACTTATTG,NAAAGTAGACTTCGTG,NAAAGTAGACTTTGTG,NAAAGTAGAGTTAGTG,NAAAGTAGTCTTAGTG,NAAAGTCGACTTAGTG,NAAAGTGGACTTAGTG,NAAAGTTGACTTAGTG,NAAATTAGACTTAGTG,NAAGGTAGACTTAGTG,NGAAGTAGACTTAGTG,NNAAGTAGACTTAGTG,NTAAGTAGACTTAGTG 1968 1,5,1,1,2,1,1,2,5,1,1,2,1,1,1,2,1,1,1,4,1,1
I think umi_tools infers the barcode whitelist from the reads, which may contain Ns (I am not sure about this). You can filter the barcodes containing Ns out of your whitelist, and Chromap will try to correct those Ns in the reads.
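For the filtering step, here is a minimal Python sketch. It assumes the usual four-column, tab-separated layout of `umi_tools whitelist` output (barcode, neighbouring barcodes, barcode count, neighbour counts) and that Chromap wants a plain file with one barcode per line. The function names and file paths are hypothetical and only serve the illustration, so adapt them to your pipeline.

```python
import csv

VALID_BASES = set("ACGT")


def split_whitelist(rows):
    """Split whitelist rows into (clean, flagged) based on the first column.

    A row is 'clean' when its barcode contains only A, C, G, T.
    """
    clean, flagged = [], []
    for row in rows:
        if not row or not row[0]:
            continue  # skip blank lines
        barcode = row[0].upper()
        if set(barcode) <= VALID_BASES:
            clean.append(row)
        else:
            flagged.append(row)
    return clean, flagged


def filter_whitelist(in_path, out_path):
    """Write barcodes without N (one per line) and report what was dropped.

    Returns (kept_barcodes, dropped_barcodes, reads_on_dropped_barcodes).
    The read total uses column 3, assumed to hold the barcode's own count.
    """
    with open(in_path, newline="") as fin:
        rows = list(csv.reader(fin, delimiter="\t"))
    clean, flagged = split_whitelist(rows)
    with open(out_path, "w") as fout:
        for row in clean:
            fout.write(row[0] + "\n")
    dropped_reads = sum(
        int(row[2]) for row in flagged if len(row) > 2 and row[2].isdigit()
    )
    return len(clean), len(flagged), dropped_reads
```

Calling `filter_whitelist("sample_whitelist.tsv", "sample_whitelist.noN.txt")` would then give you the number of barcodes kept, the number dropped, and how many reads sat on the dropped barcodes, which helps judge how much is at stake before relying on the barcode correction.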
Hi everybody,
When I run chromap in scATAC mode with the command:
the software raises an exception related to the barcodes contained in the whitelist.txt file:
As far as I understand, the exception occurs because some of the barcodes contain letters other than A, T, C, and G, such as N. I say this because removing those barcodes from the whitelist solves the issue.
Is this intended?
I ask because those barcodes are associated with a non-negligible number of reads!
Best,
Stefano