Closed ktpolanski closed 2 months ago
TRUST4 has not been optimized for long reads yet. It accepts long read lengths, but the results may be suboptimal. For example, if there are too many indel sequencing errors, TRUST4 may not handle them well.
Since long reads may contain a lot of sequence before the V gene and after the C gene (like the poly-A tails you found), you could try adding the "--repseq" option, which aggressively trims the sequence outside the VDJ region of each read.
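For illustration, here is a rough Python sketch of how such a call might be assembled for single-end reads with separate barcode and UMI files. The input and output names (`reads.fq`, `barcodes.fq`, `umis.fq`, `longread_sample`) are placeholders, and the two reference FASTA names are assumed to be the human files distributed with TRUST4; adjust them to your setup.

```python
import shlex

def build_trust4_command(reads, barcodes, umis, out_prefix):
    """Assemble a run-trust4 argument list with --repseq enabled.

    All file names passed in are hypothetical placeholders.
    """
    return [
        "run-trust4",
        "-f", "hg38_bcrtcr.fa",       # assumed: V/D/J/C gene coordinates and sequences
        "--ref", "human_IMGT+C.fa",   # assumed: detailed IMGT reference
        "-u", reads,                  # single-end read file
        "--barcode", barcodes,        # file holding one cell barcode per read
        "--UMI", umis,                # file holding one UMI per read
        "--repseq",                   # trim sequence outside the VDJ region
        "-o", out_prefix,             # prefix for the output files
    ]

cmd = build_trust4_command("reads.fq", "barcodes.fq", "umis.fq", "longread_sample")
print(shlex.join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` once the paths point at real files.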
Hope this helps.
That is actually super convenient, just in case some unforeseen garbage sneaks into the reads outside of the adapters I expect and will be actively looking for. Thanks for the heads up!
Hello,
It would appear that there are some protocols out there that combine 10X barcoding with long reads. Their analysis workflow has them pre-filter the reads to get rid of failed UMIs, then process everything via MiXCR, and apply the UMI information in post-processing by ditching UMIs where reads were assigned to different clonotypes.
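To make that post-processing step concrete, here is a minimal Python sketch of the idea as I understand it, not the protocol's actual code. It assumes each read has already been reduced to a hypothetical `(cell_barcode, umi, clonotype)` tuple, and that a UMI is judged per cell barcode rather than globally.

```python
from collections import defaultdict

def drop_conflicting_umis(assignments):
    """Keep only reads whose (barcode, UMI) pair maps to a single clonotype.

    assignments: iterable of (cell_barcode, umi, clonotype) tuples, one per read.
    """
    records = list(assignments)
    clonotypes_per_umi = defaultdict(set)
    for cb, umi, clonotype in records:
        clonotypes_per_umi[(cb, umi)].add(clonotype)
    # A UMI seen with more than one clonotype is treated as unreliable and dropped.
    return [r for r in records if len(clonotypes_per_umi[(r[0], r[1])]) == 1]

# Toy data: the UMI "GGA" in cell "AAAC" is split between two clonotypes.
reads = [
    ("AAAC", "TTG", "clone1"),
    ("AAAC", "TTG", "clone1"),
    ("AAAC", "GGA", "clone1"),
    ("AAAC", "GGA", "clone2"),
    ("CCGT", "TTG", "clone3"),
]
kept = drop_conflicting_umis(reads)
```

In this toy example both "GGA" reads are discarded and the other three are kept.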
TRUST4's innate support for CB+UMI information could form the basis of a more elegant way to process the data. I fully expect that it should be quite straightforward - prior issues mention TRUST4 working well with PacBio data, and I can prepare split reads with CB+UMI information pulled out and stored separately.
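As a sketch of what I mean by pulling the CB+UMI out, something along these lines would do, assuming the adapter has already been trimmed so that each read starts with a 16 bp barcode followed by a 12 bp UMI. Those lengths, the layout, and the function name are assumptions for illustration; the real protocol may differ.

```python
def split_cb_umi(seq, qual, cb_len=16, umi_len=12):
    """Split a read into (barcode, UMI, remaining sequence, remaining qualities).

    Assumes the barcode and UMI sit at the very start of the read.
    """
    prefix = cb_len + umi_len
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if len(seq) <= prefix:
        raise ValueError("read too short to hold barcode, UMI and insert")
    return seq[:cb_len], seq[cb_len:prefix], seq[prefix:], qual[prefix:]

# Toy read: 16 x A as barcode, 12 x C as UMI, then the insert.
seq = "A" * 16 + "C" * 12 + "GATTACA"
qual = "I" * len(seq)
cb, umi, insert_seq, insert_qual = split_cb_umi(seq, qual)
```

The barcode, UMI, and insert would then be written to three separate files, keeping the reads in the same order in each.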
Just a couple quick questions:
Thanks a lot and sorry for the trouble!