Closed (weizhiting closed this issue 8 months ago)
What you observed is quite normal.
(1) In RNA-seq, only a small fraction of the reads come from BCR/TCR, because so many other genes are expressed. BCR/TCR expression also depends on the level of immune cell infiltration. It is possible that the tumor tissue does not contain many B/T cells, hence the low BCR/TCR expression.
(2) BCRs are usually more highly expressed than TCRs, especially in the plasma B cells found in tissue. Within BCRs, the light chain is usually more highly expressed than the heavy chain. There is even a phenomenon known as free light chains, where secreted light chains do not associate with any heavy chain.
(3) If you are interested in particular CDR3s, you can focus on the ones with an abundance of more than 1. If you are interested in diversity analysis, you should keep the singletons as well.
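For anyone who wants to check points (1) to (3) on their own report files, here is a minimal Python sketch. It is not part of TRUST4. It assumes a tab-separated report whose first eight columns are count, frequency, CDR3nt, CDR3aa, V, D, J, C, with `*` standing in for a missing gene; please verify that against your own files. The sample rows, the function names, and the `min_count` parameter are all made up for illustration.

```python
import math
from collections import Counter

# Hypothetical rows for illustration only; the column order is an assumption.
SAMPLE_REPORT = (
    "#count\tfrequency\tCDR3nt\tCDR3aa\tV\tD\tJ\tC\n"
    "120\t0.600\tTGTCAGCAG\tCQQ\tIGKV1-5*01\t*\tIGKJ1*01\tIGKC\n"
    "60\t0.300\tTGTCAGTCT\tCQS\tIGLV2-14*01\t*\tIGLJ2*01\tIGLC2\n"
    "15\t0.075\tTGTGCGAGA\tCAR\tIGHV3-23*01\tIGHD3-10*01\tIGHJ4*02\tIGHG1\n"
    "1\t0.005\tTGTGCGAAA\tCAK\tIGHV1-69*01\t*\tIGHJ6*02\tIGHM\n"
    "3\t0.015\tTGTGCCAGC\tCAS\tTRBV5-1*01\t*\tTRBJ2-7*01\tTRBC2\n"
    "1\t0.005\tTGTGCTGTG\tCAV\tTRAV12-2*01\t*\tTRAJ33*01\tTRAC\n"
)

CHAINS = {"IGH", "IGK", "IGL", "TRA", "TRB", "TRG", "TRD"}


def chain_of(*genes):
    """Infer the chain from the first gene name with a known 3-letter prefix."""
    for gene in genes:
        if gene[:3] in CHAINS:
            return gene[:3]
    return "unknown"


def parse_report(text):
    """Turn report text into a list of dicts, skipping the '#' header line."""
    clones = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        clones.append({
            "count": float(f[0]),
            "cdr3aa": f[3],
            "chain": chain_of(f[4], f[6], f[7]),  # V, J, C columns
        })
    return clones


def reads_per_chain(clones):
    """Sum the abundance of all clonotypes for each chain."""
    totals = Counter()
    for clone in clones:
        totals[clone["chain"]] += clone["count"]
    return totals


def drop_singletons(clones, min_count=2):
    """Keep clonotypes with abundance of at least min_count (i.e. more than 1)."""
    return [c for c in clones if c["count"] >= min_count]


def shannon_entropy(clones):
    """Shannon entropy (natural log) over clonotype abundances."""
    total = sum(c["count"] for c in clones)
    return -sum((c["count"] / total) * math.log(c["count"] / total)
                for c in clones)


clones = parse_report(SAMPLE_REPORT)
totals = reads_per_chain(clones)
bcr = totals["IGH"] + totals["IGK"] + totals["IGL"]
light = totals["IGK"] + totals["IGL"]

print("BCR share of all BCR/TCR reads:", bcr / sum(totals.values()))
print("Light chain share of BCR reads:", light / bcr)
print("Clonotypes kept for CDR3-level analysis:", len(drop_singletons(clones)))
print("Shannon entropy of IGH, singletons included:",
      shannon_entropy([c for c in clones if c["chain"] == "IGH"]))
```

The filter is deliberately separate from the diversity calculation: the first is for looking at individual CDR3s, while the second keeps every clonotype, singletons included, as suggested in (3).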
@mourisl could you please point me to some literature regarding (2) above? Specifically, on BCR lambda/kappa expression being higher than heavy chain expression? Thank you.
@tdfy We confirmed this observation in Supplementary Figure 5b of the TRUST4 manuscript, based on 10x single-cell data (https://static-content.springer.com/esm/art%3A10.1038%2Fs41592-021-01142-2/MediaObjects/41592_2021_1142_MOESM1_ESM.pdf). There is probably other literature on this, but it might not mention the point explicitly.
Thank you, @mourisl!
Hi, first of all, thanks for the great tool. I ran TRUST4 on bulk RNA-seq datasets of solid tumor samples. I have several questions about the results in the report file:
(1) Every sample in my dataset has about 100 million reads, but the total number of reads mapped to BCR/TCR is only tens of thousands, or even just several hundred. Furthermore, 99% of those reads are mapped to BCR. Is this normal?
(2) Of the reads mapped to BCR, 95% are mapped to the BCR light chain and only 5% to the BCR heavy chain. In my opinion, the numbers of reads mapped to the heavy chain and the light chain should be equal. Am I right?
(3) If the results are right, should I do some quality control before downstream analysis? For example, should I filter out the clones that have only a few reads, since those clones are likely to be false positives? If there is no need to do so, why not? Attached are the outputs of several samples in my cohort: 97PT1-1.txt, 99PT1-1.txt, 164PT1-1.txt, 165PT1-1.txt, 167PT1-1.txt, 95PT1-1.txt
Thanks for your reply.