dms-vep / SARS-CoV-2_XBB.1.5_spike_DMS

Plot binding vs escape #67

Closed Bernadetadad closed 10 months ago

Bernadetadad commented 10 months ago

@jbloom this PR adds the binding_vs_escape.ipynb notebook, which plots ACE2 binding vs sera escape.

The notebook should be pretty self-explanatory: I calculate R for each site, then keep sites that have >=7 measurements and R <= -0.82. Honestly, I could make the R cutoff stricter (more negative), but then site 371 would not be included because it has an annoying F371L outlier (otherwise the inverse correlation for this site would be very high). The mutation count >=7 filter is mostly determined by site 570, for which we have promising mass photometry measurements and which obviously has a strong inverse R.
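For readers who have not opened the notebook, the per-site filter described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`site`, `ACE2_binding`, `sera_escape`), the function `select_sites`, and the toy data are all assumptions made for the example. Only the two cutoffs come from the description above.

```python
import pandas as pd

MIN_MUTATIONS = 7   # cutoff quoted in the comment above
MAX_R = -0.82       # keep sites whose Pearson R is at or below this value


def select_sites(df, min_mutations=MIN_MUTATIONS, max_r=MAX_R):
    """Per-site Pearson R of binding vs escape, filtered by count and R.

    Column names are hypothetical; adjust them to the real data frame.
    """
    rows = []
    for site, group in df.groupby("site"):
        rows.append(
            {
                "site": site,
                "n_mutations": len(group),
                # Series.corr defaults to the Pearson correlation
                "R": group["ACE2_binding"].corr(group["sera_escape"]),
            }
        )
    stats = pd.DataFrame(rows, columns=["site", "n_mutations", "R"])
    keep = (stats["n_mutations"] >= min_mutations) & (stats["R"] <= max_r)
    return stats[keep].reset_index(drop=True)


# Toy data (made up): site 1 is inversely correlated with 7 mutations,
# site 2 is positively correlated, site 3 has too few mutations.
toy = pd.DataFrame(
    {
        "site": [1] * 7 + [2] * 7 + [3] * 3,
        "ACE2_binding": list(range(7)) + list(range(7)) + [0, 1, 2],
        "sera_escape": list(range(6, -1, -1)) + list(range(7)) + [2, 1, 0],
    }
)
selected = select_sites(toy)
```

With this toy input only site 1 passes both filters. A site with constant values yields a NaN correlation and is dropped by the R comparison.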

The other thing to note is that I calculate R including only mutations that have functional score >=-1.5. Compared with including all mutations regardless of functional score, this changes the R value quite a bit for some sites. (Previously I included all mutations, and this was the reason why corr() did not match the transform_regression value.) I think this is the right thing to do because ultimately those are the only sites that we want to display. Does this seem reasonable?
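To illustrate why the functional score filter can change R so much, here is a small sketch with made-up numbers. The column names, the helper `site_r`, and the data are assumptions for the example only; the -1.5 cutoff is the one quoted above. One poorly functional mutation that is low on both axes is enough to flip the sign of the correlation in this toy case.

```python
import pandas as pd

FUNC_CUTOFF = -1.5  # functional score cutoff quoted in the comment above


def site_r(df, func_cutoff=None):
    """Per-site Pearson R, optionally dropping mutations below func_cutoff.

    Column names are hypothetical; adjust them to the real data frame.
    """
    if func_cutoff is not None:
        df = df[df["functional_score"] >= func_cutoff]
    return pd.Series(
        {
            site: group["ACE2_binding"].corr(group["sera_escape"])
            for site, group in df.groupby("site")
        },
        name="R",
    )


# Toy data (made up): four well-tolerated mutations on an inverse line,
# plus one deleterious mutation that is low for both binding and escape.
toy = pd.DataFrame(
    {
        "site": [100] * 5,
        "functional_score": [-0.1, -0.3, 0.0, -0.5, -3.0],
        "ACE2_binding": [-1.0, -0.5, 0.0, 0.5, -2.0],
        "sera_escape": [1.0, 0.5, 0.0, -0.5, -1.0],
    }
)

r_all = site_r(toy)                                # all mutations
r_filtered = site_r(toy, func_cutoff=FUNC_CUTOFF)  # functional score >= -1.5
```

Here `r_filtered` is strongly negative while `r_all` is positive, which is the kind of mismatch that shows up when one calculation filters on functional score and the other does not.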

Some sites with structural modulation like 167 and 234 don't end up being plotted here because they are either all bad or all good for binding and I think they just don't have the range for high correlation. We could still consider plotting them separately.

Finally, at the bottom there's the same binding vs escape plot for select sites that we know are targeted by antibodies. I think it's good to have this for comparison.

I wonder if plots like these would have been the best way to select sites to validate by mass photometry, oh well ;/

@jbloom If you could incorporate this into the snakemake pipeline, that would be great.

jbloom commented 10 months ago

Merging as a scratch notebook, will work on moving it to the pipeline in another pull request.