prmunn closed this issue 9 months ago
This should have been fixed several releases ago, can you give the exact error that you have?
The regex pattern can include Xs, but how that is achieved depends on what you want to do with the bases that match the Xs. Do you want to keep or discard them?
I'm running version 1.1.2 - is it fixed in that version? I would like to keep the bases that match the X's
Here is the command I'm running and the resulting error:
umi_tools extract --extract-method=string \
  -p XXXCCCCCCCCCCCCXXXCCCCCCCCCCCCXXXCCCCCCCCCCCCXXXXXXXXXXXXXXXXXNNNNNNNN \
  --filtered-out=sciRNA-10K_extract_filtered_out.txt \
  --filtered-out2=sciRNA-10K_extract_filtered_out2.txt \
  --error-correct-cell \
  --quality-filter-mask=20 \
  --quality-encoding=phred33 \
  --whitelist=sciRNA-10K_predictedBCwhitelist.txt \
  -I sciRNA-10K_whitelist_out_R2.fastq \
  -S sciRNA-10K_hBC_UMI_R2.fastq.gz \
  --read2-in=sciRNA-10K_whitelist_out_R1.fastq \
  --read2-out=sciRNA-10K_hBC_UMI_R1.fastq.gz \
  -L sciRNA-10K_extractBC.log
Traceback (most recent call last):
  File "/programs/UMI-tools/bin/umi_tools", line 8, in <module>
    sys.exit(main())
  File "/programs/UMI-tools/lib64/python3.9/site-packages/umi_tools/umi_tools.py", line 61, in main
    module.main(sys.argv)
  File "/programs/UMI-tools/lib64/python3.9/site-packages/umi_tools/extract.py", line 314, in main
    whitelist is None):
NameError: name 'whitelist' is not defined
This particular problem was fixed in 1.1.3. I recommend you update.
To specify the barcode as a regex while keeping the bases that match the Xs, you could use:
XXXCCCCCCCCCCCCXXXCCCCCCCCCCCCXXXCCCCCCCCCCCCXXXXXXXXXXXXXXXXXNNNNNNNN
'^...(?P<cell_1>.{12})...(?P<cell_2>.{12})...(?P<cell_3>.{12}).{17}(?P<umi_1>.{8})'
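If it helps to see how the string pattern maps onto the regex, here is a rough sketch in plain Python. Note that `pattern_to_regex` is a hypothetical helper written for this illustration and is not part of UMI-tools; the read is synthetic, and the segment lengths (3, 12, 17, 8) are taken from the regex above. It emits `.{3}` where the regex above writes `...`, which is equivalent.

```python
import re
from itertools import groupby


def pattern_to_regex(pattern):
    """Hypothetical helper (NOT a UMI-tools function): convert a string-style
    barcode pattern made of X/C/N characters into a regex with named groups."""
    group_names = {"C": "cell", "N": "umi"}
    counts = {"C": 0, "N": 0}
    parts = ["^"]
    for char, run in groupby(pattern):
        length = len(list(run))
        if char == "X":
            # Unnamed wildcard: these bases are matched but not captured.
            parts.append(".{%d}" % length)
        elif char in group_names:
            counts[char] += 1
            parts.append("(?P<%s_%d>.{%d})" % (group_names[char], counts[char], length))
        else:
            raise ValueError("unexpected pattern character: %r" % char)
    return "".join(parts)


# Pattern built from the segment lengths used in the regex above.
pattern = ("X" * 3 + "C" * 12) * 3 + "X" * 17 + "N" * 8
barcode_regex = pattern_to_regex(pattern)

# Synthetic read (made up for illustration): three cell barcode blocks,
# a 17-base spacer, an 8-base UMI, then some downstream sequence.
read = "GGG" + "A" * 12 + "GGG" + "C" * 12 + "GGG" + "T" * 12 + "G" * 17 + "ACGTACGT" + "TTTT"

match = re.match(barcode_regex, read)
cell_barcode = match.group("cell_1") + match.group("cell_2") + match.group("cell_3")
umi = match.group("umi_1")
print(barcode_regex)
print(cell_barcode, umi)
```

This only shows which bases each named group captures. As I understand the UMI-tools documentation, with regex extraction the bases outside named groups stay in the read, while groups whose names start with `discard_` are removed, so leaving the X positions unnamed is what keeps them.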
Thanks for your quick reply. I'll upgrade to the latest version and try the regex you suggested.
I have an issue similar to #509 where I need to use the regex option when also using a whitelist. However, my BC pattern is XXXCCCCCCCCCCCCXXXCCCCCCCCCCCCXXXCCCCCCCCCCCCXXXXXXXXXXXXXXXXXNNNNNNNN and I'm not sure what the regex is for this (previously, I've only seen regex patterns for N's and C's). Is there a regex pattern that can also include X's, or alternatively, is there a way to pass in the pattern as a string? (Currently the string option does not appear to work with a whitelist.)