Open LukaP-BB opened 6 months ago
Currently, we don't support parsing the barcode from the header. As a workaround, you can extract the raw barcodes into a separate FASTA file, like:
>A01789:135:HLKCJDMXY:1:1101:1027:1047
CGTGCCTATTCGGACAGT
I will add support for parsing the barcode from the header in one of the next two releases.
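For anyone landing here with the same setup, the workaround above could be scripted roughly as follows. This is only a minimal sketch, not part of TRUST4: it assumes each FASTQ header looks like `@<read id> RG:Z:<barcode>` (whitespace-separated), which may not match your files, and the function names `barcode_fasta_records` and `write_barcode_fasta` are made up for illustration.

```python
import gzip
import re

# Assumption: the barcode is stored as "RG:Z:<barcode>" after the read ID,
# separated from it by a space or tab. Adjust the pattern if your headers differ.
RG_TAG = re.compile(r"(?:^|\s)RG:Z:(\S+)")


def barcode_fasta_records(fastq_lines):
    """Yield (read_id, barcode) for every 4-line FASTQ record."""
    for i, line in enumerate(fastq_lines):
        if i % 4 != 0:
            continue  # only the first line of each record is a header
        header = line.rstrip("\r\n")
        if not header:
            continue  # tolerate a trailing blank line
        if not header.startswith("@"):
            raise ValueError(f"line {i + 1} is not a FASTQ header: {header!r}")
        read_id = header[1:].split()[0]
        match = RG_TAG.search(header)
        if match is None:
            raise ValueError(f"no RG:Z tag in header: {header!r}")
        yield read_id, match.group(1)


def write_barcode_fasta(fastq_path, fasta_path):
    """Write one FASTA record per read: the read ID, then its raw barcode."""
    opener = gzip.open if fastq_path.endswith(".gz") else open
    with opener(fastq_path, "rt") as fin, open(fasta_path, "w") as fout:
        for read_id, barcode in barcode_fasta_records(fin):
            fout.write(f">{read_id}\n{barcode}\n")
```

Calling `write_barcode_fasta("R1.fastq.gz", "barcodes.fa")` (hypothetical file names) should produce records in the same shape as the example above, in the same order as the reads.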
Thanks for your swift reply. The solution seems to work, as TRUST4 is now running.
This is a tangent to the original issue, but do you have a recommendation for the number of threads to use? I launched a test run on 1 thread, but it is taking more than 24 hours to complete on my data. Is the relationship between the number of threads and speed linear?
I usually use 8 threads. I think the gain probably plateaus after 16 threads. Which step is TRUST4 stuck on? Which version of TRUST4 are you using?
Hi, I'm running TRUST4 v1.0.5.1 according to conda. I tried again with 20 threads, just to be sure to overshoot, and it got quite slow at the same step, where the log displays:
[Sat Jun 8 08:55:39 2024] Processed 32600000 reads (30149746 are used for assembly)
It then timed out after 2 days.
My data is probably not appropriate as it is, since the R1 and R2 fastq.gz files are ~27G each, and most of the reads in them will not be IGH reads. If I align beforehand and provide BAM files to TRUST4, I guess it will be able to focus on the IG regions more efficiently? I originally wanted to avoid doing the alignment myself, since most of the workflow is outsourced.
Is it possible to upgrade to the most recent version, v1.1.1? The speed on barcode-based data has improved considerably since v1.1.0.
I'll try it and get back to you once I've tested it. I naively assumed that conda had installed the latest version. Thanks for your help! :heart:
I have FASTQ files for scDNA with barcodes extracted into the headers, in the RG:Z field.
Is there a way to use this information, similar to specifying the field when the input is a BAM file? I couldn't find one.
If there is currently no way to do it, what would be your recommended way to specify barcodes?
I tried extracting the raw barcodes from the headers into a text file, but that doesn't seem to be the right solution.