Bernadetadad closed this 10 months ago
This is for the section showing that non-RBD escape is mostly driven by the RBD up/down conformation.
If there's a way to automatically pick out sites where ACE2 affinity and escape have a very strong negative correlation, that would be ideal for finding sites that escape sera not via direct antibody binding but via some spike conformational change. Some sites of course have relatively few mutations measured, so perhaps we should only look at sites that have >3 mutations measured.
This looks promising. Sites that don't have as good a correlation (167, 234) are basically ones that don't have the range, so changes there are either all good (234) or all bad (167) for ACE2 binding.
For comparison, here are what we consider antibody-targeted sites. 473 shows some correlation because it's basically all bad for binding and all bad for escape.
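One way the automatic selection could be sketched, assuming the per-mutation data is available as `(site, ace2_binding, escape)` tuples. The input format, the function names, and the `max_r=-0.8` cutoff are all assumptions for illustration, not what the pipeline uses; `min_mutations=4` encodes the ">3 mutations measured" idea.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation; returns None if either variable has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def site_correlations(records, min_mutations=4):
    """Per-site correlation of ACE2 binding vs escape.

    `records` is an iterable of (site, ace2_binding, escape) tuples,
    one per measured mutation (hypothetical input format).
    Sites with fewer than `min_mutations` measurements are skipped.
    """
    by_site = defaultdict(list)
    for site, ace2, escape in records:
        by_site[site].append((ace2, escape))
    result = {}
    for site, pairs in by_site.items():
        if len(pairs) < min_mutations:
            continue
        r = pearson_r([p[0] for p in pairs], [p[1] for p in pairs])
        if r is not None:
            result[site] = r
    return result


def strongly_negative_sites(records, max_r=-0.8, min_mutations=4):
    """Sites with correlation <= max_r, most negative first (cutoff is an assumption)."""
    corrs = site_correlations(records, min_mutations)
    return sorted((s for s, r in corrs.items() if r <= max_r), key=lambda s: corrs[s])
```

Sites where every mutation has the same ACE2 effect give an undefined correlation and are dropped here. Sites with only a narrow range of ACE2 effects are still kept, but their correlation will tend to be weak.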
@Bernadetadad, nice job on your notebook on this, both on coding it up and especially on really figuring out what is happening with all these trends.
I integrated it into the pipeline, with the main change being that the big plot can now potentially show all sites, and which ones to show can be filtered by a slider on the correlation.
More specifically for the figure, I made plots that I think we could show in Figure 4. Let me know what you think, and re-open the issue if needed.
The notebook is here and high-quality logo plot images as SVGs are here.
Basically, I divided spike into three chunks: RBD ACE2 proximal, RBD ACE2 distal, and non-RBD.
For each region, I found the 7 sites with the largest-effect escape mutations, and then made both the correlation plots you had made before and logo plots colored by ACE2 affinity.
Logo plots are colored so that mutations decreasing ACE2 binding are purple and those enhancing binding are green:
For the RBD ACE2-proximal sites, there is little correlation between escape and ACE2 binding, as expected if mutations here mostly just knock off neutralizing antibodies directly:
But as you had noted, for non-RBD sites ACE2 binding and escape are strongly inversely correlated, suggesting that escape works by putting the RBD in the down conformation to escape RBD antibodies:
The same is true for RBD ACE2-distal sites as for non-RBD sites. Nicely, I think there is prior work showing that mutations at 371 (and maybe 375 / 376?) do put the RBD in the down conformation:
What do you think of making a figure around these?
Looks great - it's really cool to understand something fundamental about spike. Should we plot this for the RBD libraries as well? Here we filter out a lot of sites because they don't meet the requirement of at least 7 mutations, and the RBD libraries would let us look at those sites better. I know the RBD library ACE2 measurements are maybe not as good, but just eyeballing it, they kind of look good enough to show these patterns (or at least worth taking a look).
Yes, definitely could. Although I sort of think we should use the above for the paper figure, as showing non-RBD is so compelling. I was thinking of first getting back to the paper, and maybe after we write around this we could return to additional analyses as needed? But let me know if you think others are needed for the main figures, and in that case I can work on it now.
@jbloom it looks like when sites are reclassified as proximal and distal, some sites with high inverse correlation drop out. These are the top (RBD) inverse-correlation sites without classification:
But these are the sites that come up once classified:
What happened to, e.g., site 367? It's certainly ACE2 distal (definitely more distant than 371).
Just to clarify, my code in the notebook is not looking for the sites with the strongest correlation. It is looking for the sites with the strongest maximal escape mutation in each region grouping.
My rationale is that for the main text of the paper, it is more convincing to argue there is an inverse correlation if we say "We looked at sites with top escape mutations in each region, and for NTD and RBD non-proximal, they all have this correlation." If we say "We looked for sites with top correlation and they had the correlation," that sounds more circular.
Does that make sense? It could also make sense to show the sites with top correlation, but that would have to be done under a different rationale.
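To make the distinction concrete, here is a minimal sketch of selecting sites by their strongest escape mutation within each region, rather than by correlation. The `(region, site, escape)` input format and the function name are hypothetical, not the notebook's actual code.

```python
from collections import defaultdict


def top_escape_sites(records, n_sites=7):
    """Return the `n_sites` sites per region with the largest maximal escape.

    `records` is an iterable of (region, site, escape) tuples, one per
    measured mutation (hypothetical input format).
    """
    # Largest escape value observed for any mutation at each (region, site).
    max_escape = {}
    for region, site, escape in records:
        key = (region, site)
        if key not in max_escape or escape > max_escape[key]:
            max_escape[key] = escape

    by_region = defaultdict(list)
    for (region, site), esc in max_escape.items():
        by_region[region].append((esc, site))

    # Sort by descending maximal escape, breaking ties by site number.
    return {
        region: [site for _, site in sorted(vals, key=lambda t: (-t[0], t[1]))[:n_sites]]
        for region, vals in by_region.items()
    }
```

Because the selection never looks at ACE2 binding, any inverse correlation seen at the selected sites is not built in by the selection step.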
That's okay, although I would keep in mind that perhaps the most evolutionarily relevant sites would have average escape because the strongest escape at these sites is associated with the strongest defects in ACE2 binding.
For the conformational escape section in the paper, we want to have a figure that correlates ACE2 affinity vs serum escape for specific mutations. @jbloom could you make this a custom rule where it would be easy to add or remove sites for comparison? My current list of sites is: 167, 200, 385, 234, 468, 572, 839, 852, and 856. If we don't want to have this in the repo specifically, then it's easy for me to do it.
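A rough sketch of how the site list could be kept editable, assuming per-mutation data as `(site, mutant, ace2_binding, escape)` tuples. The names `COMPARISON_SITES` and `mutations_at_sites` are made up for illustration; in the pipeline the list would presumably live in a config file instead.

```python
# Sites to compare, taken from the list above; edit this to add or remove sites.
COMPARISON_SITES = [167, 200, 385, 234, 468, 572, 839, 852, 856]


def mutations_at_sites(records, sites=COMPARISON_SITES):
    """Select the mutations at the requested sites for plotting.

    `records` is an iterable of (site, mutant, ace2_binding, escape)
    tuples (hypothetical input format). Returns the matching rows in
    their original order, plus a sorted list of requested sites that
    had no measurements, so typos or filtered-out sites are easy to spot.
    """
    wanted = set(sites)
    rows = [r for r in records if r[0] in wanted]
    missing = sorted(wanted - {r[0] for r in rows})
    return rows, missing
```

The returned rows can go straight into an ACE2 affinity vs escape scatter plot faceted by site.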