xiaohe0404 closed this issue 1 year ago
Hi @xiaohe0404, do you want to start with trimmed fastq files rather than raw sequencing files? Extracting the UMI sequence in the cutadapt step is essential for the downstream analysis. I am not sure if your processed files still fit this pipeline. Could you provide more details about how you trim the fastq files?
Thanks for your timely reply! Here are my detailed parameters:
fastp can trim the adapters in your sample, but its output format (`UMI_NNNNNN`) is not compatible with this pipeline.
If you are using the template-switch with dual-UMI strategy for library construction, I highly recommend running this pipeline from the raw reads with the `barcode: NNNNNNXXX-XXXNNNNNN`[^1][^2] setting. No other settings need to be modified.
[^1]: The `XXX` after the `-` symbol trims the mismatched tail at the 3' end. From your description, you may be using the random RT method, which would also create mismatches at the 3' end of the reads.
[^2]: The `NNNNNN` at the end extracts the "3' barcode" you mentioned in step 3. If you do not need this sequence, replace `NNNNNN` with `XXXXXX`.
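To make the barcode pattern easier to read, here is a minimal Python sketch of how such a pattern could be applied to a single read. It is not the pipeline's own code (the pipeline does this in its cutadapt step), and the function name `apply_barcode` is hypothetical. It assumes only the semantics described in this thread: `N` marks a base that is kept as UMI/barcode, `X` marks a base that is discarded, and `-` stands for the insert between the 5' and 3' parts.

```python
def apply_barcode(pattern: str, read: str):
    """Split a read according to a pattern such as 'NNNNNNXXX-XXXNNNNNN'.

    Assumed semantics (illustration only):
      N = base extracted as UMI/barcode
      X = base trimmed and discarded
      - = the insert that is kept for mapping
    Returns (umi_5prime, insert, umi_3prime).
    """
    # Everything before '-' applies to the 5' end, everything after to the 3' end.
    five, _, three = pattern.partition("-")
    if len(five) + len(three) > len(read):
        raise ValueError("read is shorter than the barcode pattern")

    head = read[:len(five)]
    tail = read[len(read) - len(three):] if three else ""
    insert = read[len(five):len(read) - len(three)]

    # Keep only the bases that sit under an 'N' in the pattern.
    umi5 = "".join(base for base, p in zip(head, five) if p == "N")
    umi3 = "".join(base for base, p in zip(tail, three) if p == "N")
    return umi5, insert, umi3


# A made-up 30-nt read: 6 nt UMI, 3 nt to trim, 12 nt insert, 3 nt to trim, 6 nt UMI.
read = "ACGTAC" + "GGG" + "TTTTCCCCAAAA" + "CCC" + "TGCATG"
umi5, insert, umi3 = apply_barcode("NNNNNNXXX-XXXNNNNNN", read)
print(umi5, insert, umi3)
```

With `XXXXXX` in place of the trailing `NNNNNN`, the same nine 3' bases are still removed from the insert, but none of them are kept as a barcode.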
Thanks for your reply, this is really helpful!
You are welcome. If you have any questions about this pipeline, do not hesitate to open a new issue.
I'm sorry to bother you again. Could you share the detailed STAR parameters you use? I'm also still confused about how to set the barcode parameter, because I have already removed the UMI sequences and barcodes. Do I still need to write "barcode: '-NNNNN'" in data.yaml? Looking forward to your reply! I would really, really appreciate it!! Best wishes