dms-vep / SARS-CoV-2_XBB.1.5_spike_DMS

RBD expression filter #41

Closed Bernadetadad closed 11 months ago

Bernadetadad commented 11 months ago

@jbloom the attached plot adds an RBD expression filter. It definitely improves correlation across the board, but XBB15 RBD lenti vs XBB15 yeast still does not improve as much as one would expect (?). The dimeric ACE2 data for the lenti RBD library and the full spike data both correlate well with yeast, though, so maybe that one experiment is just not as good. I do like having the RBD expression filter here.

Also, I think the clipping on the plots we have in the repo does not look good. I'm going to change the clip values to the ones used in the attached plot, if that's fine with you.

affinity_corr.html.zip

jbloom commented 11 months ago

Changing the clipping is fine. Just be sure to specify this and any other filter values in the snakemake params, not in the code itself, so we can remember how they are being clipped easily.
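As a minimal sketch of what keeping clip values out of the code could look like: the plotting code takes the bounds as arguments, and the values live in one parameter block. The `params` dictionary below is a hypothetical stand-in for the values that would be declared under the rule's snakemake `params`; the key names `clip_lower` and `clip_upper` and the numbers are assumptions for illustration, not the repo's actual settings.

```python
# Hypothetical parameter block standing in for values declared in the
# snakemake params (names and numbers are illustrative assumptions).
params = {"clip_lower": -2.0, "clip_upper": 1.0}


def clip_effects(values, clip_lower, clip_upper):
    """Clip each measured effect into the range [clip_lower, clip_upper]."""
    if clip_lower > clip_upper:
        raise ValueError("clip_lower must not exceed clip_upper")
    return [min(max(v, clip_lower), clip_upper) for v in values]


# The clip bounds come only from the parameter block, never from literals
# buried in the plotting code.
clipped = clip_effects([-3.5, -0.4, 0.2, 2.7], **params)
print(clipped)
```

With this layout, changing how the plots are clipped means editing the parameter block only, so the values in use are always visible in one place.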

I don't think we need an RBD expression filter. What does it add exactly? If there is a benefit we can add it, but the benefit is not obvious to me. I don't think RBD expression in the yeast display is a significant factor here.

The issue I was noting in the paper is the poor correlation of the XBB.1.5 RBD pseudovirus DMS with the yeast display RBD DMS. Looking at the points, a lot of this seems to be due to the DMS giving dubious estimates for mutations such as Y501K. Look at the points in the lower right of the plot: almost all of them seem to come from dubious RBD pseudovirus DMS effects that differ greatly between libraries. I think we should try to figure out what is going on with that, and if we can't determine the cause, potentially add a new filter. Right now we are showing a heat map that, for instance, indicates Y501K is highly affinity enhancing, which seems dubious. The other mutations in the lower right of the scatter plot are things like Y489R or P491Y, where the RBD pseudovirus DMS also seems highly dubious.
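One possible form for such a filter, sketched under stated assumptions: flag any mutation whose two per-library measurements differ by more than a threshold. The function name `flag_disparate`, the `max_diff` parameter, and the input layout are hypothetical, and the numbers are made up for illustration; they are not data from these experiments.

```python
def flag_disparate(measurements, max_diff):
    """Return mutations whose two library measurements differ by more than max_diff.

    `measurements` maps a mutation name to a (lib1_effect, lib2_effect) pair.
    """
    return sorted(
        mut for mut, (a, b) in measurements.items() if abs(a - b) > max_diff
    )


# Made-up placeholder values, not real measurements.
example = {
    "mutA": (0.1, 0.2),
    "mutB": (1.8, -0.5),
    "mutC": (-1.0, -1.1),
}

# As with the clip values, max_diff would be set in the snakemake params.
flagged = flag_disparate(example, max_diff=1.0)
print(flagged)
```

Whether an absolute difference is the right disparity measure here is an open question; the sketch only shows the shape of the check.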

Anyway, I think we should figure out what is going on with those mutations, and rethink the wisdom of showing the RBD pseudovirus heat map given some obviously wrong measurements at key sites. I think at least part of the issue is that these mutations correlate poorly between libraries and are bad for viral entry, which makes them hard to measure with our approach.

Another option would be to not show the RBD ACE2 affinity DMS heat map, maybe even drop it from the correlation plot, and instead show just a slightly smaller heat map of ACE2 affinity effects for the RBD and NTD in the full spike. This would also mesh better with the point of the paper.