liulab-dfci / TRUST4

TCR and BCR assembly from RNA-seq data
MIT License

Clone reconstruction #173

Open monikazelazow opened 1 year ago

monikazelazow commented 1 year ago

Hello, I am using TRUST4 on my RNA-seq data to reconstruct BCR sequences. Because I am specifically interested in clonal reconstruction, I use the sequence_id (more precisely, the assemble# part) as a guide to the number and sizes of clonal groups. I noticed that the algorithm uses the V and J genes as well as the CDR3 length to identify individual clones and group BCR sequences. When I look at the CDR-H3 amino acid sequences of heavy chains placed in one clonal group, I often disagree with the assignment: I see CDR3s that differ from each other by 50% and are still treated as one clonal group. This is especially misleading in a highly diverse repertoire, where many single-sequence clonal groups are assigned to bigger clones.
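One way to put a number on the disagreement described above is to compute the lowest pairwise identity among the CDR-H3 sequences of a single group. The following is a minimal sketch, not part of TRUST4: the function names and the example sequences are hypothetical, and it assumes a simple position-by-position (Hamming-style) identity, which is only defined for CDR3s of equal length.

```python
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def min_pairwise_identity(cdr3s):
    """Lowest identity among all pairs in one group (1.0 for a singleton)."""
    return min(
        (identity(a, b) for a, b in combinations(cdr3s, 2)),
        default=1.0,
    )


# Hypothetical CDR-H3 amino acid sequences assigned to one clonal group.
group = ["CARDYWGQGT", "CARDYWGQGS", "CTKEFWAHGT"]

# A low value flags a group whose members are far apart.
print(min_pairwise_identity(group))
```

Groups whose minimum falls well below the expected threshold are the ones worth inspecting by hand.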

mourisl commented 1 year ago

Thank you for the comments. I assume you used the script trust-cluster.py for the clustering. Several parameters control the cluster size. You can use a larger value of "-s" so that only more similar CDR3s are clustered together. You can also add the option "--center" so that a leaf node cannot be too different from the center of the cluster.

monikazelazow commented 1 year ago

Thank you for your response. What is the range of "-s" that I can apply?

mourisl commented 1 year ago

I think you can try -s 0.9 and --center, so that the minimal similarity between any two CDR3s in a cluster would be roughly 80%.
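The "roughly 80%" figure can be read as a worst-case bound: if every leaf is at least 90% identical to the center, two leaves can each differ from the center at different positions, so they may differ from each other at up to 20% of positions. The sketch below illustrates that reasoning only. It assumes a Hamming-style identity on equal-length sequences, which may not be exactly what trust-cluster.py computes, and all names and sequences are hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def worst_case_pair_identity(s: float) -> float:
    """Lower bound on leaf-to-leaf identity when each leaf is >= s
    identical to the cluster center (mismatches can add up)."""
    return max(0.0, 1.0 - 2.0 * (1.0 - s))


# Hypothetical cluster: each leaf differs from the center at one of
# ten positions, but at a different position from the other leaf.
center = "CARDYWGQGT"
leaf_a = "CSRDYWGQGT"  # mismatch at position 2
leaf_b = "CARDYWGQGS"  # mismatch at position 10

print(identity(center, leaf_a), identity(center, leaf_b))
print(identity(leaf_a, leaf_b))
print(worst_case_pair_identity(0.9))
```

Without a center constraint, similarity can drift along a chain of neighbors, which is one plausible reason that very different CDR3s end up in the same group.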

monikazelazow commented 1 year ago

I am sorry to bother you one more time.

Is there a way to add the amino acid sequence of the CDR3 to the output file? Also, what is the header of this file?

Monika Zelazowska
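The thread does not say which columns the output file contains, so the following is only a sketch of one possible workaround: if a nucleotide CDR3 is available for each record, it can be translated with the standard genetic code and appended as an extra field. The function name `translate_cdr3` and the example sequence are illustrative and not part of TRUST4.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate_cdr3(nt: str) -> str:
    """Translate a nucleotide CDR3 in frame 0.

    Codons containing ambiguous bases become 'X'; an incomplete
    trailing codon is ignored.
    """
    nt = nt.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(nt[i:i + 3], "X")
        for i in range(0, len(nt) - len(nt) % 3, 3)
    )


# Hypothetical nucleotide CDR3.
print(translate_cdr3("TGTGCGAGAGATTACTGG"))
```

The translation assumes the CDR3 starts in frame; a partial or out-of-frame CDR3 would need extra handling.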

