liulab-dfci / TRUST4

TCR and BCR assembly from RNA-seq data
MIT License

umis and clonotype of trust-barcoderep-to-10X.pl results #282

Open xingyongma opened 1 week ago

xingyongma commented 1 week ago

I used trust-barcoderep-to-10X.pl to convert the barcode_report and have two questions: 1) Why are the numbers of UMIs and reads exactly the same in the results? 2) The raw_clonotype_id and raw_consensus_id are all None. Can I use TRUST4 to analyze and obtain clonotype information?

The command I used to run the software is as follows:

run-trust4 \
    -f hg38_bcrtcr.fa \
    --ref human_IMGT+C.fa \
    -t 15 \
    -u R2_001.fastq.gz \
    --barcode R1_001.fastq.gz \
    --UMI R1_001.fastq.gz \
    --readFormat bc:0:16,um:17:28,r1:29:-1 \
    --barcodeWhitelist P3CB.barcode.txt.gz

mourisl commented 1 week ago

When UMIs are used, the abundance information is transformed into a UMI count. I think the read count information is stored, so they are the same in the output.

I think the clonotype_id in 10X's definition is not very clear. I think I saw some cases where the CDR3 and the V, J, and C genes were the same but still had a different "Clonotype" ID. Therefore, I leave those as None. Does your downstream analysis workflow depend on this?

xingyongma commented 1 week ago

Thanks a lot. My main purpose in using clonotype information is to examine clone sizes, so that I can observe the expansion and contraction of clones.

mourisl commented 1 week ago

I think you can use the CDR3 sequence to define the clonotype.
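
A minimal sketch of that idea, assuming each cell has already been reduced to a (barcode, CDR3) pair taken from the barcode report. The function name `clone_sizes`, the example barcodes and sequences, and the use of "*" as the placeholder for a missing chain are all assumptions made for illustration, not part of TRUST4 itself:

```python
from collections import defaultdict


def clone_sizes(cells):
    """Return {cdr3: number of distinct barcodes carrying that CDR3}.

    `cells` is an iterable of (barcode, cdr3) pairs. Pairs with an empty
    CDR3 or the assumed missing-chain placeholder "*" are skipped.
    """
    barcodes_by_cdr3 = defaultdict(set)
    for barcode, cdr3 in cells:
        if cdr3 and cdr3 != "*":
            # A set ensures a barcode is counted once per CDR3.
            barcodes_by_cdr3[cdr3].add(barcode)
    return {cdr3: len(barcodes) for cdr3, barcodes in barcodes_by_cdr3.items()}


# Hypothetical data: two cells share one CDR3, one cell has no chain call.
example_cells = [
    ("AAAC-1", "TGTGCCAGC"),
    ("AAAG-1", "TGTGCCAGC"),
    ("AACC-1", "TGTGCTGTG"),
    ("AAGG-1", "*"),
]

sizes = clone_sizes(example_cells)
# Print clones from largest to smallest.
for cdr3, n_cells in sorted(sizes.items(), key=lambda item: -item[1]):
    print(cdr3, n_cells)
```

Comparing these per-CDR3 cell counts between samples or time points is one way to track expansion and contraction. A stricter definition could key on the V and J genes together with the CDR3 instead of the CDR3 alone.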

xingyongma commented 1 week ago

I will give it a try. Thank you for all your quick replies.