marcelm / cutadapt

Cutadapt removes adapter sequences from sequencing reads
https://cutadapt.readthedocs.io
MIT License

Add Reverse Complement modification function. #737

Open galaxy001 opened 11 months ago

galaxy001 commented 11 months ago

Some barcode sequencing methods produce the barcode FASTQ file in reverse-complemented form, so one of the paired FASTQ files needs to be reverse-complemented back.

It would be convenient if Cutadapt offered this function.

marcelm commented 11 months ago

Hi, I’m not sure I understand what format the data is in. Can you describe what exactly is in R1 and R2 or do you have a link?

Would it be enough to be able to search for the reverse-complemented barcodes?
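To illustrate the alternative raised in the question above: if searching for the reverse-complemented barcodes were sufficient, the barcode list itself could be reverse-complemented once, ahead of time, instead of modifying the reads. The following is only a minimal sketch of that idea. The function names and the example barcode are hypothetical, and nothing here is part of Cutadapt.

```python
# Complement table for uppercase DNA bases, including N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def revcomp_barcodes_fasta(barcodes):
    """Turn a {name: sequence} dict into FASTA text with every
    sequence reverse-complemented. Names are kept unchanged."""
    return "".join(
        f">{name}\n{reverse_complement(seq)}\n"
        for name, seq in barcodes.items()
    )


if __name__ == "__main__":
    # Hypothetical barcode, for illustration only.
    print(revcomp_barcodes_fasta({"spot_0001": "AACGTTGC"}), end="")
```

The resulting FASTA text could then be handed to whatever tool performs the barcode search.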

galaxy001 commented 10 months ago

The Read 1 Fastq:

@Pro31:S:PRM32402100029:1:000000:R001:C001 1:L:0
TAGATGGCTCATCCGTATCGCGGGTCACCGATGACAGATG
+
CCCC@CC@AAA@DCC@BB@CD@CC@C@CCCDCC@CB=BBB
@Pro31:S:PRM32402100029:1:000001:R001:C001 1:L:0
TTCAAGTCGGGATTCGGGTTGAGCCATTTTACCAGGGGAA
+
@@@@CB@CDBDC@@D@@B@@@DDCCB@@@DC@B@@BBC@C

The Read 2 Fastq:

@Pro31:S:PRM32402100029:1:000000:R001:C001 2:L:0
TCTTCCTGGATTTTTTTAAATCATTTTTATCTCAGAACTTAAACAAAAATTAGATGTCGTGCACGGACTGTGTGAAAGAAGATGCTTTGCATATTTGCTG
+
A@@BABDBEDBBCCBC@BBCCA@@B@CD?CACABDDCBBCCACABACABC@@C;ACBCA?ADBBB@@DBD?B@CD@BBC@@BC@>B@B<BB@FCCAC@BB
@Pro31:S:PRM32402100029:1:000001:R001:C001 2:L:0
GGGTGACACAGCAACACCTCGTCACAAAAAAAAAAAAAAAAAAAAAAAAAATTCCCCTGGTGCGAGAGTCGGGGGGGTTGTAGTAACAAGATATGCCCCC
+
@EBBC@D@BCBB@CDCCBCC@@C@CB@D@CB@AD@ACFCDDBA@BD@@@?+A9D@@<,=&$1$,7*$-739'B@$'',21&@/'$:,<8**9%=.6,8D0

Read 1 comes with the read structure 30B10M, but the first 30 bp have to be reverse-complemented for downstream programs such as STAR. The reverse-complemented 30 bp are then used as the spot barcode.

Read 2 is ordinary RNA-seq data from the spot.
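For reference, the requested transformation of Read 1 could be sketched in plain Python as follows. This is not a Cutadapt feature, and the function names are hypothetical. The default prefix length of 30 comes from the 30B10M read structure described above. Reversing the quality string of the prefix along with its bases is an assumption about what downstream tools expect.

```python
# Complement table for DNA bases, including N; case is preserved.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def revcomp_prefix(seq, qual, n=30):
    """Reverse-complement the first n bases of a read and reverse the
    matching quality values; everything after position n is unchanged.
    Returns a (sequence, qualities) tuple."""
    new_seq = reverse_complement(seq[:n]) + seq[n:]
    new_qual = qual[:n][::-1] + qual[n:]
    return new_seq, new_qual


def process_fastq(lines, n=30):
    """Yield modified FASTQ lines (without newlines) from an iterable
    of lines that holds plain four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        seq = next(it).rstrip("\n")
        plus = next(it).rstrip("\n")
        qual = next(it).rstrip("\n")
        new_seq, new_qual = revcomp_prefix(seq, qual, n)
        yield from (header.rstrip("\n"), new_seq, plus, new_qual)


if __name__ == "__main__":
    import sys

    # Example: python revcomp_prefix.py < R1.fastq > R1.fixed.fastq
    for out_line in process_fastq(sys.stdin):
        print(out_line)
```

The sketch assumes uncompressed input with exactly four lines per record; gzip-compressed files would need to be opened with `gzip.open` in text mode instead of reading from standard input.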

marcelm commented 10 months ago

Thanks, this is helpful. I will probably not be able to add something like this right now, but let’s leave the issue open as a reminder.