Correlation between averaged library A and library B functional effects

Caleb-Carr commented 7 months ago

Create a correlation plot for functional effects measured in 293T cells between library A and library B after averaging the four technical replicates within each library. The default correlation plots from the pipeline show correlations between individual technical replicates, whereas this plot shows the correlation between libraries after the technical replicates have been averaged.
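As a rough illustration of the averaging step described above, here is a minimal sketch that averages four technical replicates per library and then computes a Pearson correlation between the two library-level averages. The mutation names, effect values, and helper functions (`average_replicates`, `pearson_r`) are all hypothetical and made up for illustration. They are not taken from the notebook or the pipeline, which may compute this differently.

```python
from math import sqrt
from statistics import mean

# Hypothetical toy data: mutation -> functional effect in each of four
# technical replicates. Values are invented for illustration only.
lib_a = {
    "A10G": [-0.1, 0.0, -0.2, -0.1],
    "K25E": [-2.0, -1.8, -2.2, -2.0],
    "L40P": [-3.9, -4.1, -4.0, -4.0],
    "S77T": [0.2, 0.1, 0.3, 0.2],
}
lib_b = {
    "A10G": [0.0, -0.1, -0.1, 0.0],
    "K25E": [-1.9, -2.1, -2.0, -2.0],
    "L40P": [-3.8, -4.0, -4.2, -4.0],
    "S77T": [0.1, 0.3, 0.2, 0.2],
}


def average_replicates(library):
    """Average the technical replicates for each mutation in one library."""
    return {mutation: mean(effects) for mutation, effects in library.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


avg_a = average_replicates(lib_a)
avg_b = average_replicates(lib_b)

# Only mutations measured in both libraries can be compared.
shared = sorted(set(avg_a) & set(avg_b))
r = pearson_r([avg_a[m] for m in shared], [avg_b[m] for m in shared])
print(f"Pearson r between library averages: {r:.3f}")
```

Averaging first reduces replicate-level noise, so the resulting single library A versus library B correlation is typically higher than the individual replicate-versus-replicate correlations.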

Caleb-Carr commented 7 months ago

@jbloom This notebook calculates the correlations between libraries for functional effects. Based on the space available for Figure S6, I think the heatmap correlation would be better than the scatter plot? A minimum times seen of 2 is the default used for all heatmaps.
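For readers unfamiliar with the "times seen" filter, the sketch below shows one way such a threshold could be applied before computing correlations. The record layout, the example values, and the rule of requiring the threshold in every library are assumptions for illustration only. The pipeline's actual filtering logic may differ.

```python
MIN_TIMES_SEEN = 2  # threshold mentioned in the comment above

# Hypothetical records: (mutation, library, averaged effect, times_seen).
records = [
    ("A10G", "A", -0.10, 5),
    ("A10G", "B", -0.05, 3),
    ("K25E", "A", -2.00, 1),  # below the threshold in library A
    ("K25E", "B", -2.00, 4),
    ("L40P", "A", -4.00, 2),
    ("L40P", "B", -4.00, 2),
]


def filter_by_times_seen(rows, min_times_seen):
    """Keep only mutations that meet the threshold in every library row."""
    failing = {mut for mut, _lib, _effect, seen in rows if seen < min_times_seen}
    return [row for row in rows if row[0] not in failing]


kept = filter_by_times_seen(records, MIN_TIMES_SEEN)
kept_mutations = sorted({row[0] for row in kept})
print(kept_mutations)
```

Mutations observed in very few variants have noisier effect estimates, so a filter of this kind trades some coverage for more reliable comparisons between libraries.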

jbloom commented 7 months ago

Which reviewer question is this addressing? I agree the plot looks good in any case.

Caleb-Carr commented 7 months ago

This is related to the following question:

Reviewer #1: The mutational effects of some individual positions (not clarified in the manuscript) are inferred using a epistasis model. Predicting epistatic interactions is extremely challenging. Therefore, it is unclear how reliable for those inferred effects. To address this issue, a library with complete mutation sites and enough copy-numbers for coverage of all possible single-site mutations is necessary.

This is an excellent point. Our libraries each included ~50,000 barcoded variants (Figure S1D), which is approximately 5x coverage for all possible amino-acid mutations in each of the duplicate libraries. This means most mutations are sampled multiple times in each library, which provides good sampling, as indicated by the fact that the measured effects of mutations are highly correlated between the two independent libraries.

The reviewer is correct that the data are analyzed via global epistasis models, which are relatively simple models of epistasis that have been shown to be useful for analyzing deep mutational scanning data. We have revised the text to more clearly note this point. In addition, as described in response to the reviewer’s major issue 2 immediately above, we have performed additional analyses comparing the effects after the analysis with the global epistasis model to just the effects measured in the single-mutant variants in the library. As elaborated in that response below, the effects are highly correlated, supporting use of the global epistasis models as a reliable way to analyze these data.

The correlation plot relates to the sentence about the measured effects of mutations being correlated between the two independent libraries. Currently, we do not have any library correlation plot in the paper. However, because the main point of the reviewer's question is about the epistasis modeling, this correlation plot might not be crucial to add to the figures, depending on space? The crucial one, I think, is related to this issue.

dms-vep / LASV_Josiah_GP_DMS

Correlation between averaged library A and library B functional effects #11