dms-vep / SARS-CoV-2_XBB.1.5_spike_DMS

ACE2 vs affinity correlations for specific sites #55

Closed Bernadetadad closed 10 months ago

Bernadetadad commented 11 months ago

For the conformational escape section in the paper, we want a figure that correlates ACE2 affinity with serum escape for specific mutations. @jbloom could you make this a custom rule where it would be easy to add or remove sites for comparison? My current list of sites is: 167, 200, 385, 234, 468, 572, 839, 852 and 856. If we don't want to have this in the repo specifically, then it's easy for me to do it.

jbloom commented 11 months ago

This is for the section showing that non-RBD escape is mostly via the RBD up/down conformation.

Bernadetadad commented 11 months ago

If there's a way to automatically pick out sites where ACE2 affinity vs escape has a very strong negative correlation, that would maybe be ideal for finding sites that escape sera not via direct antibody binding but via some spike conformational change. Some sites of course have relatively few mutations measured, so perhaps we should only look at sites that have >3 mutations measured.
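A rough sketch of how that automatic pick could look, assuming a per-mutation table with hypothetical column names `site`, `ACE2_binding` and `escape` (not necessarily what the pipeline uses). The `-0.8` cutoff is an arbitrary placeholder, and sites where either measurement has no range are skipped because the correlation is undefined there:

```python
import pandas as pd


def site_correlations(df, min_mutations=4, site_col="site",
                      binding_col="ACE2_binding", escape_col="escape"):
    """Per-site Pearson r between ACE2 binding and escape (hypothetical columns)."""
    df = df.dropna(subset=[binding_col, escape_col])
    records = []
    for site, group in df.groupby(site_col):
        # Require more than 3 measured mutations at the site.
        if len(group) < min_mutations:
            continue
        # Pearson r is undefined when either variable is constant.
        if group[binding_col].nunique() < 2 or group[escape_col].nunique() < 2:
            continue
        r = group[binding_col].corr(group[escape_col])
        records.append({"site": site, "n_mutations": len(group), "pearson_r": r})
    result = pd.DataFrame(records, columns=["site", "n_mutations", "pearson_r"])
    # Most negative correlations first.
    return result.sort_values("pearson_r").reset_index(drop=True)


def strong_negative_sites(corr_df, max_r=-0.8):
    """Sites whose correlation is at or below the (arbitrary) threshold max_r."""
    return corr_df.loc[corr_df["pearson_r"] <= max_r, "site"].tolist()
```

Something like `strong_negative_sites(site_correlations(df))` would then give the candidate list, which could be compared against the hand-picked sites.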

Bernadetadad commented 10 months ago

This looks promising. Sites that don't have as good a correlation (167, 234) are basically ones that don't have the range, so changes there are either all good (234) or all bad (167) for ACE2 binding.

image

Bernadetadad commented 10 months ago

For comparison, these are what we consider antibody-targeted sites. 473 shows some correlation because it's basically all bad for binding and all bad for escape.

image

jbloom commented 10 months ago

@Bernadetadad, nice job on your notebook on this, both on coding it up and especially on really figuring out what is happening with all these trends.

I integrated it into the pipeline, with the main change being that the big plot can now potentially show all sites, and which ones are shown can be filtered by a slider on the correlation.

More specifically for the figure, I made plots that I think we could show in Figure 4. Let me know what you think, and re-open issue if needed.

The notebook is here and high-quality logo plot images as SVGs are here.

Basically, I divided spike into three chunks: RBD ACE2 proximal, RBD ACE2 distal, and non-RBD.

For each region, I found the 7 sites with the largest-effect escape mutations, and then made both the correlation plots you had made before and ACE2-affinity-colored logo plots.
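The selection step can be sketched roughly as below. This is only an illustration and not the notebook's actual code: the column names `region`, `site` and `escape` are assumptions, and each site is ranked by its single largest escape value within its region.

```python
import pandas as pd


def top_escape_sites(df, n_sites=7, region_col="region",
                     site_col="site", escape_col="escape"):
    """Top n_sites per region, ranked by each site's maximum escape value."""
    # Largest escape value observed at each site.
    max_escape = (
        df.groupby([region_col, site_col], as_index=False)[escape_col]
        .max()
        .rename(columns={escape_col: "max_escape"})
    )
    # Within each region, order sites from highest to lowest max escape.
    ranked = max_escape.sort_values(
        [region_col, "max_escape"], ascending=[True, False]
    )
    return ranked.groupby(region_col).head(n_sites).reset_index(drop=True)
```

Note that the ranking uses escape only, so the correlation with ACE2 binding plays no part in which sites get chosen.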

Logo plots are colored so that mutations that decrease ACE2 binding are purple and those that enhance binding are green:

image

For the RBD ACE2 proximal, there is little correlation between escape and ACE2 binding, as expected if mutations here mostly just knock off neutralizing antibodies directly:

image

But as you had noted, for non-RBD sites ACE2 binding and escape are strongly inversely correlated, basically suggesting the escape works by putting the RBD in the down conformation to escape RBD antibodies:

image

The same is true for RBD ACE2 distal as for non-RBD. Nicely, I think there is prior work showing that mutations at 371 (and maybe 375 / 376?) do put the RBD in the down conformation:

image

What do you think of making a figure around these?

Bernadetadad commented 10 months ago

Looks great - it's really cool to understand something fundamental about spike. Should we plot this for the RBD libraries as well? Here we filter out a lot of sites because they don't pass the at-least-7-mutations requirement, and the RBD libraries would let us look at those sites better. I know the RBD library ACE2 measurements are maybe not as good, but just eyeballing it, it kind of looks like they are good enough to show these patterns (or at least worth taking a look).

jbloom commented 10 months ago

Yes, we definitely could. Although I sort of think we should use the above for the paper figure, as showing non-RBD is so compelling. I was thinking of first getting back to the paper, and maybe after we write around this we could return to additional analyses as needed? But let me know if you think others are needed for the main figures, and in that case I can work on it now.

Bernadetadad commented 10 months ago

@jbloom it looks like when sites are reclassified as proximal and distal, some high inverse correlation sites drop out. These are the top (RBD) inverse correlation sites without classification:

image

Bernadetadad commented 10 months ago

But these are the sites that come up once classified:

image

Bernadetadad commented 10 months ago

What happened to, e.g., site 367? It's certainly ACE2 distal (definitely more distant than 371).

jbloom commented 10 months ago

Just to clarify, my code in the notebook is not looking for sites with the strongest correlation. It is looking for the sites with the strongest maximal escape mutation in each region grouping.

My rationale is that for the main text of the paper, it is more convincing to argue there is an inverse correlation if we say "We looked at sites with top escape mutations in each region, and for NTD and RBD non-proximal, they all have this correlation." If we say "We looked for sites with top correlation and they had the correlation", that sounds more circular.
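To make the circularity point concrete, here is a toy simulation (purely illustrative, with made-up noise data and arbitrary parameters, not the real measurements). Binding and escape are drawn independently, so there is no true relationship, yet picking sites by most negative correlation still yields strongly negative values, whereas picking by top escape does not bias the correlation in either direction:

```python
import numpy as np


def simulate_selection_bias(n_sites=200, n_mut=8, n_top=7, seed=0):
    """Compare per-site r under two selection rules on pure noise."""
    rng = np.random.default_rng(seed)
    # Independent noise: no true relationship between binding and escape.
    binding = rng.normal(size=(n_sites, n_mut))
    escape = rng.normal(size=(n_sites, n_mut))
    # Per-site Pearson r computed across the mutations at each site.
    b = binding - binding.mean(axis=1, keepdims=True)
    e = escape - escape.mean(axis=1, keepdims=True)
    r = (b * e).sum(axis=1) / np.sqrt((b ** 2).sum(axis=1) * (e ** 2).sum(axis=1))
    # Rule 1: select the sites with the most negative correlation.
    by_corr = np.sort(r)[:n_top]
    # Rule 2: select the sites with the largest maximum escape value.
    top_escape_idx = np.argsort(escape.max(axis=1))[::-1][:n_top]
    by_escape = r[top_escape_idx]
    return r, by_corr, by_escape
```

Comparing `by_corr.mean()` with `by_escape.mean()` shows how much of an apparent inverse correlation can come from the selection rule alone.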

Does that make sense? It could also make sense to show sites of top correlation, but that would have to be done under a different rationale?

Bernadetadad commented 10 months ago

That's okay, although I would keep in mind that perhaps the most evolutionarily relevant sites would have average escape because the strongest escape at these sites is associated with the strongest defects in ACE2 binding.