dms-vep / SARS-CoV-2_XBB.1.5_spike_DMS


show receptor affinity and better sliders via `dms-vep-pipeline-3` #13

Closed jbloom closed 1 year ago

jbloom commented 1 year ago

@Bernadetadad, this pull request makes some significant changes. It updates the repo to dms-vep-pipeline-3 version 3.2.2, which has the improvements described in the CHANGELOG here.

Briefly:

Also, if you merge, use the Squash and merge option rather than the Merge pull request option in the merge button.

Bernadetadad commented 1 year ago

Thanks @jbloom, I will merge RBD library escape and ACE2 affinity repos in the same way once I get back sera selection data later this week.

jbloom commented 1 year ago

Migrating to dms-vep-pipeline-3

I am posting detailed notes on how to migrate to the new dms-vep-pipeline-3 (version 3.2.2).

Substantive improvements

Receptor affinity

You can now enter receptor affinity experiments, and get plots like this one that show how mutations affect receptor affinity. Note that these are present in a separate section called Receptor affinity in the main docs of a repo.

To add receptor affinity, enter the information under the keys receptor_selections and avg_receptor_selections in your antibody_escape_config.yaml file, as here. These keys live in the antibody escape config for backwards compatibility. Note that the internal terminology still refers to selections with soluble receptor as "antibody", just as "antibody" is also used as a synonym for serum. You will likely want to change some of the regularization weights and plotting parameters for receptors. For instance, we no longer want a nonzero reg_spread_weight, as most mutations at a site probably will not have the same effect on receptor affinity.

The code automatically takes care of the adjustment that "escape" from a soluble receptor means decreased affinity for it.
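As a rough illustration of that adjustment, here is a minimal sketch that assumes the conversion is a plain sign flip. The function name and mutation labels are hypothetical and are not taken from dms-vep-pipeline-3, whose actual implementation may differ.

```python
def receptor_affinity_effect(escape_from_receptor: float) -> float:
    """Convert "escape" from soluble receptor into an effect on affinity.

    Assumption for illustration only: the adjustment is a simple sign flip,
    so positive escape (less neutralization by soluble receptor) is reported
    as a negative effect on receptor affinity.
    """
    return -escape_from_receptor


# Made-up values with placeholder mutation labels, not real measurements.
escape_by_mutation = {"mut_a": 1.5, "mut_b": -0.5}
affinity_by_mutation = {
    mutation: receptor_affinity_effect(escape)
    for mutation, escape in escape_by_mutation.items()
}
```

Under this assumption, a mutation that escapes soluble receptor is read as decreasing affinity, and one that is neutralized more strongly is read as increasing it.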

Plots better handle negative functional effect mutations

Previously, if escape could not be estimated for a mutation (due to insufficient no-antibody counts), the heat maps showed it as missing. But some mutations can be measured as functionally deleterious even though we cannot measure their escape, because they do not make functional virus even in the no-antibody condition. Those mutations are now shown as grayed out (deleterious) depending on the value of the functional effect slider, so we can distinguish mutations that were not measured from those that are in the library but too deleterious to measure escape for. Note we have also added the min_filters option under plot_hide_stats, which you should use to set the minimal filtering (e.g., on times_seen) required to use a functional effect measurement.
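The distinction between the three kinds of heat-map cell can be sketched as follows. This is only an illustration of the logic described above: the function, its parameter names, and the order of the checks are assumptions, not the pipeline's actual code or configuration schema.

```python
from typing import Optional


def heatmap_cell_status(
    escape: Optional[float],
    functional_effect: Optional[float],
    times_seen: Optional[int],
    min_functional_effect: float,  # hypothetical stand-in for the slider value
    min_times_seen: int,  # hypothetical stand-in for a min_filters setting
) -> str:
    """Classify a mutation as "deleterious", "measured", or "missing"."""
    # Only trust the functional effect if it passes the minimal filter.
    functional_ok = (
        functional_effect is not None
        and times_seen is not None
        and times_seen >= min_times_seen
    )
    if functional_ok and functional_effect < min_functional_effect:
        return "deleterious"  # grayed out: in the library but too deleterious
    if escape is not None:
        return "measured"  # escape is shown in the heat map
    return "missing"  # no usable measurement of any kind


# Illustrative calls with made-up numbers.
status_deleterious = heatmap_cell_status(None, -3.0, 5, -2.0, 3)
status_measured = heatmap_cell_status(0.8, -0.1, 5, -2.0, 3)
status_missing = heatmap_cell_status(None, -3.0, 1, -2.0, 3)
```

In the last call the functional effect is below the slider value but fails the minimal times_seen filter, so under these assumptions the mutation is treated as missing rather than deleterious.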

CSVs for average antibody escape also report per-model values

The CSV files with the antibody escape values averaged over selections now also include the per-selection (replicate) values.
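As a sketch of how such a file might be read, the example below parses a small made-up CSV and compares the reported average with the per-selection columns. The column names (escape_avg, escape_rep1, escape_rep2), the values, and the use of the mean as the averaging method are all assumptions for illustration. Check the headers of your own output files and your averaging settings.

```python
import csv
import io
from statistics import mean

# Made-up file contents with hypothetical column names and placeholder mutations.
csv_text = """mutation,escape_avg,escape_rep1,escape_rep2
mut_a,0.5,0.4,0.6
mut_b,-0.25,-0.5,0.0
"""


def per_selection_values(row: dict) -> list:
    """Collect the per-selection escape values from one CSV row."""
    return [
        float(value)
        for column, value in row.items()
        if column.startswith("escape_rep")
    ]


rows = list(csv.DictReader(io.StringIO(csv_text)))

# Largest gap between the reported average and the mean of the replicates.
max_discrepancy = max(
    abs(mean(per_selection_values(row)) - float(row["escape_avg"]))
    for row in rows
)
```

Having the replicate values next to the average makes it easy to flag mutations where the selections disagree.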

Some internal re-working of pipeline and output file names

There are some tweaks to the pipeline internals and output file names. These mostly should not affect you, except as described in the next section.

Migrating to new pipeline

Just pull the latest version of dms-vep-pipeline-3 (currently 3.2.2) into your repo.

Then, since some output file names have changed, you will need to update your .gitignore to look like the one here, plus any additional entries you need for your specific repo.

Finally, after you have updated .gitignore, entirely delete your ./results/ and ./docs/ folders, re-run the pipeline, and commit the new outputs.