Open ErminZ opened 1 month ago
Hi @ErminZ
Thanks for your question. I would expect the 10x adapters/primers to be kept in duplex reads, but I haven't tested this yet. I will try it out and get back to you.
Thank you for your reply! I also just tested 1 million duplex reads with the single-cell pipeline, and most of the reads have 10x primers. Please let me know if you could explain more about the trimming mechanism in Dorado or wf-single-cell.
"BIOLOGICAL_duplex": {
"general": {
"n_reads": 1065534,
"rl_mean": 811.6867167073036,
"rl_std_dev": 446.9855681878009,
"n_fl": 483380,
"n_stranded": 989717
},
"strand_counts": {
"n_plus": 565033,
"n_minus": 424684
},
"detailed_config": {
"adapter1_f-adapter2_f": 278771,
"adapter1_f": 246063,
"adapter2_r": 228474,
"adapter2_r-adapter1_r": 167066,
"*": 39447,
"adapter2_f": 18974,
"adapter1_r": 12826,
"adapter2_f-adapter1_f": 12190,
"adapter1_f-adapter2_f-adapter2_r-adapter1_r": 10708,
"adapter1_f-adapter2_f-adapter2_r": 9347,
"adapter2_r-adapter1_r-adapter1_f-adapter2_f": 9240,
"adapter1_r-adapter2_r": 5760,
"adapter2_r-adapter1_r-adapter1_f": 4170,
"adapter1_f-adapter2_r-adapter2_f": 2135,
"adapter2_f-adapter2_r": 2554,
"adapter1_f-adapter2_r": 2486,
"adapter2_r-adapter2_f": 2294,
"adapter1_f-adapter1_r": 1824,
"adapter1_f-adapter1_r-adapter2_f": 1401,
"adapter2_r-adapter1_f": 1274,
"adapter2_r-adapter2_f-adapter1_r": 1158,
"adapter1_r-adapter1_f": 1237,
"adapter1_r-adapter1_f-adapter2_f-adapter2_r": 545,
"adapter2_r-adapter1_f-adapter1_r": 768,
"adapter2_f-adapter2_r-adapter1_r-adapter1_f": 740,
"adapter1_r-adapter1_f-adapter2_f": 615,
"adapter1_f-adapter2_f-adapter1_r-adapter2_r": 480,
"adapter2_r-adapter1_r-adapter2_f-adapter1_f": 651,
"adapter2_f-adapter2_r-adapter1_r": 592,
"adapter2_f-adapter1_f-adapter2_r": 173,
"adapter1_r-adapter2_f-adapter1_f": 161,
"adapter1_f-adapter2_f-adapter1_r": 131,
"adapter2_f-adapter1_r-adapter2_r": 158,
"adapter2_f-adapter1_r": 110,
"adapter2_f-adapter1_f-adapter1_r": 93,
"adapter1_r-adapter2_f": 109,
"adapter1_f-adapter1_r-adapter2_r": 45,
"adapter1_f-adapter2_r-adapter1_r": 87,
"adapter2_r-adapter2_f-adapter1_f": 120,
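As a rough check on the counts above, the share of reads without any adapter hit can be computed directly. This is only a minimal sketch that copies three values from the output; it assumes that the `"*"` key counts reads in which no adapter was found and that `n_fl` counts full-length reads, neither of which the output itself states.

```python
# Values copied from the "BIOLOGICAL_duplex" summary above.
n_reads = 1065534
n_full_length = 483380   # "n_fl"; assumed to mean full-length reads
n_no_adapter = 39447     # "*" key; assumed to mean "no adapter found"


def fraction(count, total):
    """Return count as a fraction of total."""
    return count / total


no_adapter_frac = fraction(n_no_adapter, n_reads)
full_length_frac = fraction(n_full_length, n_reads)

print(f"no adapter found: {no_adapter_frac:.1%}")
print(f"full length:      {full_length_frac:.1%}")
```

If the assumption about `"*"` holds, only a small minority of these duplex reads lack adapters entirely, which is consistent with the observation that most reads still carry 10x primers.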
Is your feature related to a problem?
Duplex reads contain neither primers nor adapters: because of how duplex calling works, the duplex reads themselves have the adapters and primers trimmed off. In the adapter configuration section of wf-single-cell, reads without adapters/primers are categorized as "Others: No valid adapters found; not used in further analysis". As a result, the high-quality duplex reads are unusable in the pipeline.
Describe the solution you'd like
Add a wf-single-cell parameter that makes use of duplex reads categorized as Others in the adapter configuration section. Duplex reads carry the tag
dx:i:1
in the BAM file, or have a ";" in the read ID. Is there a way to keep these reads?
Describe alternatives you've considered
Or ask for a Dorado duplex option that does not trim primers/adapters.
Or add the primer sequences manually to the duplex reads before running wf-single-cell.
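For any of these options, the duplex reads first have to be picked out. The two criteria mentioned in this issue (a `dx:i:1` tag, or a ";" in the read ID) could be checked roughly as follows. This is a sketch under assumptions, not part of wf-single-cell or Dorado: it works on SAM-formatted text lines, and the helper names `is_duplex` and `split_duplex` are hypothetical.

```python
def is_duplex(sam_line):
    """Return True if a SAM record looks like a duplex read.

    Uses the two criteria from this issue: a `dx:i:1` tag, or a ";"
    in the read ID.
    """
    fields = sam_line.rstrip("\n").split("\t")
    read_id = fields[0]
    optional_tags = fields[11:]  # SAM optional fields start at column 12
    return "dx:i:1" in optional_tags or ";" in read_id


def split_duplex(sam_lines):
    """Split SAM records into (duplex, other), skipping header lines."""
    duplex, other = [], []
    for line in sam_lines:
        if line.startswith("@"):  # header lines start with "@"
            continue
        (duplex if is_duplex(line) else other).append(line)
    return duplex, other
```

The input lines could come from the text output of `samtools view` on the BAM file. The reads in `duplex` are the ones a new parameter would need to rescue from the Others category.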
Additional context
Thank you for developing such a useful pipeline that works for long reads.