Open junhaobearxiong opened 10 months ago
Hi Bear, Thanks for trying this out! These are all great questions. As you can see, TCR specificity prediction is a very hard problem. Here are a few thoughts:
Some of the things I am trying right now are
I am also excited to try the new version of AlphaFold (https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/a-glimpse-of-the-next-generation-of-alphafold/alphafold_latest_oct2023.pdf) if/when it is released.
Excited to hear any thoughts you have on these issues! Take care, Phil
(In reply to the message from Junhao (Bear) Xiong sent Saturday, November 18, 2023, 11:33 AM; subject: [phbradley/TCRdock] Decoy discrimination benchmark using finetuned model (Issue #10))
Hi Phil!
Thank you so much for the detailed reply! Indeed, specificity prediction is a challenging problem, and I've also been interested in exploring potential avenues to improve it. I will try to run the original setup in the paper (i.e. 3 AlphaFold runs per target, using non-finetuned parameters, excluding nearby templates) and see if the results are consistent.
If I may ask, I'm curious about what you mean by "generating more data for training"? Would it be generating more structure data for training structure prediction, or using binding data as in the Motmaen et al. paper?
I'm excited to hear about the new things you are trying!
Best, Bear
Hi Bear, Right, it would be generating (and finding!) more binding data for training the models. More structures would certainly help, too; they are just harder to come by... Take care, Phil
(In reply to the message from Junhao (Bear) Xiong sent Sunday, November 19, 2023, 10:13 AM; subject: Re: [phbradley/TCRdock] Decoy discrimination benchmark using finetuned model (Issue #10))
Hi Phil!
Thank you so much for all the work that went into creating and curating this great resource! I've been interested in leveraging TCRdock for predicting TCR binding specificity for some peptides of interest, and I've been trying to reproduce the decoy discrimination results in the paper as a first-step sanity check. The main difference from the procedure described in the paper is that I used the finetuned model and the updated protocol described in #7, namely:
The reasons why I used the finetuned model are that it is more efficient to run, and supposed to generate higher quality predicted structures (which, by the results of the paper, correlates with better decoy discrimination), but I'm wondering whether you would expect decoy discrimination to generally improve with the finetuned model? I'm asking because from the initial analyses I did, the results seem to be quite different from the results in the paper, so I want to double check if it is what you would expect:
wt_binding_score column in datasets_from_the_paper/table_S2_specificity_benchmark_tcrs.csv. I've limited my analyses to the 6 human peptides from the paper, and used the 50 human background TCRs in the aforementioned csv file for the pMHC-intrinsic effect correction.
Again, thank you so much for this great resource!
Best, Bear
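For readers following along, here is a minimal sketch of the kind of analysis described in the post above: a pMHC-intrinsic correction using background TCRs, followed by a decoy discrimination AUROC. It is not taken from the TCRdock code. The function names, the dictionary layout, and the toy scores are hypothetical. It assumes the correction subtracts each pMHC's mean score over the background TCRs, and that a lower score means stronger predicted binding.

```python
from collections import defaultdict
from statistics import mean


def correct_pmhc_intrinsic(scores, background_tcrs):
    """Subtract each pMHC's mean background-TCR score (assumed correction).

    scores: dict mapping (tcr_id, pmhc_id) -> raw binding score.
    background_tcrs: set of tcr_ids used only to estimate the per-pMHC offset.
    Returns corrected scores for the non-background TCRs.
    """
    per_pmhc = defaultdict(list)
    for (tcr, pmhc), score in scores.items():
        if tcr in background_tcrs:
            per_pmhc[pmhc].append(score)
    # Raises KeyError below if a pMHC has no background scores at all.
    offsets = {pmhc: mean(vals) for pmhc, vals in per_pmhc.items()}
    return {
        (tcr, pmhc): score - offsets[pmhc]
        for (tcr, pmhc), score in scores.items()
        if tcr not in background_tcrs
    }


def auroc(positive_scores, negative_scores):
    """Fraction of (true, decoy) pairs where the true pair scores lower.

    Ties count as 0.5. Assumes lower score = stronger predicted binding.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p < n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Toy data (made up): one query TCR, two background TCRs, two peptides.
raw = {
    ("tcr1", "pepA"): 4.0, ("tcr1", "pepB"): 6.0,
    ("bg1", "pepA"): 5.0, ("bg2", "pepA"): 7.0,
    ("bg1", "pepB"): 5.0, ("bg2", "pepB"): 5.0,
}
corrected = correct_pmhc_intrinsic(raw, {"bg1", "bg2"})
# Treat pepA as tcr1's true peptide and pepB as the decoy.
print(auroc([corrected[("tcr1", "pepA")]], [corrected[("tcr1", "pepB")]]))
```

In a real run the raw scores would come from the model outputs and the background set from the 50 human background TCRs in the csv file mentioned above; the pairwise AUROC loop is quadratic, which is fine for a benchmark of this size.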