liulab-dfci / TRUST4

TCR and BCR assembly from RNA-seq data
MIT License
10X fastq barcode+index and bam for Trust4 #305

Open olu2016 opened 2 months ago

olu2016 commented 2 months ago

Hi,

I have TCR fastq datasets, and the barcode+UMI is 20bp (NCCCAAGGGT+AAAGGTAGTA) instead of the usual 26bp. I wondered if Trust4 could work successfully in this scenario?

Another question is: One of the bam files in the vdj_t folder (from cellranger multi pipeline) is consensus.bam, can Trust4 use this kind of bam file as input?

Thanks

mourisl commented 2 months ago

I have TCR fastq datasets, and the barcode+UMI is 20bp (NCCCAAGGGT+AAAGGTAGTA) instead of the usual 26bp. I wondered if Trust4 could work successfully in this scenario?

Yes, the format should be fine. You just need to adjust the --readFormat option accordingly.
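
As a rough illustration of what "adjust accordingly" could look like, the sketch below builds a --readFormat value from the barcode and UMI lengths. It assumes the comma-separated `field:start:end` syntax with 0-based, inclusive coordinates, as in the `bc:0:15,um:16:-1` style example from the TRUST4 README, and it assumes the barcode and UMI sit back to back at the start of the barcode read. `build_read_format` is a hypothetical helper, not part of TRUST4, so please check the result against the `run-trust4` usage message for your version.

```python
def build_read_format(barcode_len: int, umi_len: int) -> str:
    """Return a --readFormat value for a read laid out as barcode then UMI.

    Assumption: coordinates are 0-based and inclusive, matching the
    README-style example bc:0:15,um:16:-1. Verify for your TRUST4 version.
    """
    if barcode_len <= 0 or umi_len <= 0:
        raise ValueError("barcode and UMI lengths must be positive")
    bc_end = barcode_len - 1              # last barcode base (inclusive)
    um_start = barcode_len                # UMI starts right after the barcode
    um_end = barcode_len + umi_len - 1    # last UMI base (inclusive)
    return f"bc:0:{bc_end},um:{um_start}:{um_end}"


if __name__ == "__main__":
    # Usual 26bp layout: 16bp barcode + 10bp UMI
    print(build_read_format(16, 10))
    # 20bp layout from the question: 10bp barcode + 10bp UMI
    print(build_read_format(10, 10))
```

The returned string would be passed as the value of --readFormat, alongside whatever read fields (such as r1 or r2) the run needs.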

One of the bam files in the vdj_t folder (from cellranger multi pipeline) is consensus.bam, can Trust4 use this kind of bam file as input?

Is the consensus.bam file already the result of cellranger's assembly? While TRUST4 can take it as input, I don't see the need for that. Do you mean you want to reannotate those cellranger assemblies?

olu2016 commented 2 months ago

Thanks for your reply. Yes, I just wanted the reannotation of the assemblies.

mourisl commented 2 months ago

It might not be directly supported: consensus.bam is not an alignment to the reference genome, so the chromosome IDs will cause some issues. You may need to convert the BAM file to a fastq file first.
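
One common way to do such a conversion is `samtools fastq`, which writes FASTQ records to standard output when no output options are given. The sketch below is only an illustration under that assumption: `bam_to_fastq_cmd` and `convert` are hypothetical helper names, the file names are placeholders, and samtools must be installed and on PATH for `convert` to run. The resulting fastq contains whatever sequences are stored as records in the BAM, so it is worth checking that these are what you want to reannotate.

```python
import shlex
import subprocess
from typing import List


def bam_to_fastq_cmd(bam_path: str) -> List[str]:
    """Build a `samtools fastq` command that writes FASTQ to stdout."""
    return ["samtools", "fastq", bam_path]


def convert(bam_path: str, fastq_path: str) -> None:
    """Run the conversion and save stdout to fastq_path (needs samtools on PATH)."""
    with open(fastq_path, "wb") as out:
        subprocess.run(bam_to_fastq_cmd(bam_path), stdout=out, check=True)


if __name__ == "__main__":
    # Placeholder file name; print the command instead of running it.
    print(shlex.join(bam_to_fastq_cmd("consensus.bam")))
```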

olu2016 commented 2 months ago

Thanks a lot. I used the raw data in fastq format with the 26 bp (barcode+UMI) format, and my Trust4 run was successful. The 20 bp format that I indicated earlier was wrong.

Have a good day!