Suboptimal representation of benchmarked methods

brianhie commented 3 years ago

Superfluous alphabet characters and differing site ranges caused a mismatch between the in silico DMS and the validation data. This also leads to comparison issues with baseline models.

Ambiguous amino acids were added into the alphabet, e.g., https://github.com/brianhie/viral-mutation/blob/f368b5175fda439d2526e3f89f8cd790ad2d8b07/results/cov/semantics/analyze_semantics_cov2rbd_bilstm_512.txt#L3
The benchmark vocabulary had an incorrect 'U' character; the CSCS vocabulary had incorrect 'B', 'J', 'U', 'X', and 'Z' characters.
DMS escape experiments do not test the full protein or protein domain, which affects both benchmarking and CSCS validation.
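For illustration, here is a minimal sketch of the kind of restriction described above: enumerating in silico single-residue mutants over only the 20 canonical amino acids and only the site range that the DMS experiment actually assayed. The names `CANONICAL_AAS`, `NONCANONICAL` and `single_mutants`, and the 0-based half-open site range, are assumptions made for this example and are not taken from the repository.

```python
# Hypothetical sketch, not the repository's code.
CANONICAL_AAS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids
NONCANONICAL = set("BJUXZ")              # codes that should not be in the alphabet

def single_mutants(seq, start, end, alphabet=CANONICAL_AAS):
    """Enumerate (position, wildtype, mutant) substitutions.

    Positions are 0-based and cover start..end-1, which is assumed to be
    the site range tested in the DMS experiment.
    """
    mutants = []
    for pos in range(start, end):
        wt = seq[pos]
        for aa in alphabet:
            if aa != wt:                 # skip the synonymous "mutation"
                mutants.append((pos, wt, aa))
    return mutants

# Toy usage: a 3-residue sequence where only the last two sites were assayed.
toy_mutants = single_mutants("MKT", 1, 3)
```

With this restriction, every in silico mutant has a possible counterpart in the validation data, so the two sets can be compared position by position.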

The issue and the scope of changes to results are under active investigation.

brianhie commented 3 years ago

After initial investigation, the issues have been addressed. In summary (and thankfully), they do not affect the conclusions of our paper, although some updates do need to take place.

In silico DMS now aligns with in vitro DMS results, e.g., here (https://github.com/brianhie/viral-mutation/blob/167d9e0cf0135194b09e87f090b8f7fbfc470aae/bin/cached_semantics.py#L72) and here (https://github.com/brianhie/viral-mutation/blob/167d9e0cf0135194b09e87f090b8f7fbfc470aae/bin/escape_energy.py#L148).
Vocabulary issue fixed here (https://github.com/brianhie/viral-mutation/blob/167d9e0cf0135194b09e87f090b8f7fbfc470aae/bin/cached_semantics.py#L66) and for benchmarking here (https://github.com/brianhie/viral-mutation/blob/167d9e0cf0135194b09e87f090b8f7fbfc470aae/bin/escape_energy.py#L370).
Additional issue of parameter mismatch in CoV-2 DMS between our experiments and original author experiments fixed here (https://github.com/brianhie/viral-mutation/blob/167d9e0cf0135194b09e87f090b8f7fbfc470aae/bin/escape.py#L163).
Semantic change benchmarks were being assessed in the opposite direction; this is fixed, e.g., here (https://github.com/brianhie/viral-mutation/blob/702f0ea2fcd475996e562bb8b503dc57aaaead1f/bin/escape_energy.py#L298).

These issues led, in particular, to a suboptimal representation of baseline methods. However, CSCS with the model still outperforms the benchmarks on all DMS datasets tested, across the full range of cutoffs defining an escape mutation:

[Image: path49142-6 (plot of CSCS and baseline predictive performance across escape cutoffs)]

In the plot above, the dashed line indicates the representative escape cutoff reported in the initial paper. CSCS predictive performance increases with escape cutoff stringency and consistently outperforms baseline methods, especially at the most stringent antibody selection cutoffs, where the assayed mutations have the strongest experimental evidence of escape potential.
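As a rough sketch of what a cutoff sweep like this involves: for each cutoff, mutations whose experimental escape measurement meets the cutoff are labeled as escape, and an AUC is computed for the model's scores against those labels. The function names, the `>=` comparison, and the toy numbers below are assumptions for illustration only; they are not the repository's implementation or data.

```python
# Hypothetical sketch, not the repository's code.
def roc_auc(scores, labels):
    """ROC AUC as the probability that a positive outranks a negative.

    Ties count as half. Returns None if either class is empty.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_by_cutoff(scores, escape_values, cutoffs):
    """Relabel mutations at each escape cutoff and recompute the AUC."""
    return {
        c: roc_auc(scores, [v >= c for v in escape_values])
        for c in cutoffs
    }

# Toy usage with made-up numbers: one model score and one experimental
# escape measurement per assayed mutation.
toy_scores = [0.9, 0.8, 0.2, 0.1]
toy_escape = [0.5, 0.3, 0.05, 0.0]
toy_result = auc_by_cutoff(toy_scores, toy_escape, [0.1, 0.4, 0.6])
```

Note that at very stringent cutoffs few or no mutations remain labeled as escape, so the AUC becomes noisy or undefined; the sketch returns `None` in the undefined case.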

Investigation is ongoing, including updates to original paper.

brianhie commented 3 years ago

Instructions for reproducing the new benchmarks have been added to the README here: https://github.com/brianhie/viral-mutation#benchmarking-experiments, with updates to the data tarball as well.

brianhie commented 3 years ago

AUCs have been updated at results/escape_results.txt.

The cutoff experiment has also inspired a new analysis. It shows that increasing the stringency of the experimental evidence for fitness, or for (loss of) antibody binding, also results in better predictive performance of grammaticality or semantic change, respectively, consistent with our biological hypotheses! This new analysis, along with the benchmarking cutoff analysis above and a new variant analysis, has been added to our postscript at results/HZBB21_Postscript_v0.pdf.

brianhie commented 3 years ago

Paper updated, closing.

brianhie / viral-mutation

Suboptimal representation of benchmarked methods #5