Closed robinycfang closed 9 months ago
I'm guessing you are getting this error because each of your regexes contains two alternatives, and one alternative has a <umi_2> group but no <umi_1> group. Whichever alternative matches, the named groups from the other one are never collected.
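To illustrate the guess above, here is a minimal Python sketch using only the standard `re` module. It is not the umi_tools source: `naive_join` is a hypothetical stand-in for any code that assumes every named UMI group was captured, and the two example reads are made up.

```python
import re

# The pattern from the question: two alternatives that define different groups.
PATTERN = re.compile(
    r"((?P<umi_1>^[ACGT]{3}[ACG])(?P<discard_1>T))|(?P<umi_2>^[ACGT]{3})"
)


def captured_groups(read):
    """Return the named groups for a read; groups in the unused alternative are None."""
    return PATTERN.match(read).groupdict()


def naive_join(groups):
    """Hypothetical helper: concatenate all umi_* groups, assuming none is None."""
    umi = ""
    for name in sorted(groups):
        if name.startswith("umi_"):
            umi += groups[name]  # raises TypeError when the group did not match
    return umi


# 4th base is not T and 5th base is T, so the first alternative matches.
first = captured_groups("ACGATGGG")
# 4th base is T, so the first alternative fails and the second one matches.
second = captured_groups("ACGTTGGG")

print(first)
print(second)

try:
    naive_join(second)
except TypeError as err:
    print("TypeError:", err)
```

In both cases one of the `umi_*` groups is `None`, so any code that concatenates the groups without checking fails with the same kind of `TypeError` as in the report. Giving every alternative the same set of groups, or dropping the alternation entirely, avoids the problem.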
I think for your use case, you can use something much simpler:
$ umi_tools extract --extract-method=regex \
--bc-pattern="(?P<umi_1>^[ACGT]{4})" \
--bc-pattern2="(?P<umi_1>^[ACGT]{4})" \
-I test_R1.fastq.gz \
--read2-in=test_R2.fastq.gz \
--stdout=processed.1.fastq.gz \
--read2-out=processed.2.fastq.gz \
--log=processed.log
or even:
$ umi_tools extract --extract-method=string \
--bc-pattern=NNNN \
--bc-pattern2=NNNN \
-I test_R1.fastq.gz \
--read2-in=test_R2.fastq.gz \
--stdout=processed.1.fastq.gz \
--read2-out=processed.2.fastq.gz \
--log=processed.log
should work.
@robinycfang - Closing now due to inactivity
Hi
I have PE reads, with UMIs on both 5' and 3' ends of both reads.
umi_tools extract --extract-method=regex \
--bc-pattern="((?P<umi_1>^[ACGT]{3}[ACG])(?P<discard_1>T))|(?P<umi_2>^[ACGT]{3})" \
--bc-pattern2="((?P<umi_1>^[ACGT]{3}[ACG])(?P<discard_1>T))|(?P<umi_2>^[ACGT]{3})" \
-I test_R1.fastq.gz \
--read2-in=test_R2.fastq.gz \
--stdout=processed.1.fastq.gz \
--read2-out=processed.2.fastq.gz \
--log=processed.log
but it gave me TypeError: can only concatenate str (not "NoneType") to str
Any comments would be appreciated!