dms-vep / LASV_Josiah_GP_DMS

Analysis of DMS data on Lassa virus GP using dms-vep-pipeline-3

Compare DMS scores for each region as defined in Figure 2C to the natural variability (i.e., effective amino acids) for that region #13

Closed: Caleb-Carr closed this issue 4 weeks ago

Caleb-Carr commented 1 month ago

This issue is related to the following reviewer comment:

1. The GP2 domains are more functionally constrained than GP1. How conserved are these epitopes among different Lassa lineages?

The reviewer seems to be asking whether GP2 is more constrained than GP1 in terms of natural variation. Maybe you could analyze the relative variability (number of effective amino acids, averaged across sites) for each domain as defined in Figure 2C and see to what extent the domain-level constraint measured by DMS corresponds to natural variability.

Caleb-Carr commented 1 month ago

@jbloom This notebook analyzes the functional effects of mutations compared to natural variation (effective amino acids) at each site for different GPC regions. Note that this is a site-level analysis, and the last plot differs from Figure 2C because mutation effects are averaged at each site to be consistent with the site-level effective amino acid metric.

jbloom commented 1 month ago

OK, we will have to figure out what exactly to show to address reviewer comment. Comment is more about domains than all the other more fine-grained subsets of sites. Also, I wonder if site entropy would be better than effective amino acids since it won't collapse the smaller values and exaggerate the larger ones as much.
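For anyone following along, here is a minimal sketch of why the two metrics behave differently. It assumes the number of effective amino acids is defined as the exponential of the Shannon entropy (in nats) of the per-site amino-acid frequencies; the notebook's exact definition is not shown in this thread, so treat that as an assumption. The function names and example frequencies are hypothetical.

```python
import numpy as np


def site_entropy(freqs):
    """Shannon entropy (in nats) of one site's amino-acid frequencies."""
    p = np.asarray(freqs, dtype=float)
    p = p / p.sum()   # normalise in case raw counts were passed
    p = p[p > 0]      # treat 0 * log(0) as 0
    # adding 0.0 turns a possible -0.0 into 0.0 for invariant sites
    return float(-(p * np.log(p)).sum()) + 0.0


def n_effective(freqs):
    """Effective number of amino acids, ASSUMED here to be exp(entropy)."""
    return float(np.exp(site_entropy(freqs)))


# Hypothetical frequency vectors, not taken from the Lassa alignment.
examples = {
    "invariant": [1.0],
    "rare variant": [0.95, 0.05],
    "two equal": [0.5, 0.5],
    "all 20 equal": [0.05] * 20,
}

for label, freqs in examples.items():
    print(
        f"{label:>12}: entropy={site_entropy(freqs):.3f}  "
        f"n_eff={n_effective(freqs):.2f}"
    )
```

Under that assumption the effective number is an exponential transform of entropy, so differences among nearly conserved sites are compressed toward 1 while highly variable sites are stretched toward 20.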

Caleb-Carr commented 1 month ago

@jbloom Updated the notebook to show only the three main domains of GPC (SSP, GP1, and GP2) and changed the metric to site entropy. To more directly address the question of whether GP2 is more constrained than GP1, I ran a Mann-Whitney U test on site entropy between each pair of regions.
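A rough sketch of the pairwise comparison, assuming a two-sided test via `scipy.stats.mannwhitneyu`. The entropy values below are random placeholders with arbitrary group sizes, not the real data, and `pairwise_mannwhitney` is a hypothetical helper name rather than the notebook's code.

```python
from itertools import combinations

import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)

# Placeholder per-site entropies (NOT real data; sizes are arbitrary).
entropy_by_region = {
    "SSP": rng.gamma(1.0, 0.10, size=50),
    "GP1": rng.gamma(1.0, 0.20, size=200),
    "GP2": rng.gamma(1.0, 0.10, size=230),
}


def pairwise_mannwhitney(groups):
    """Run a two-sided Mann-Whitney U test for every pair of groups."""
    results = {}
    for a, b in combinations(groups, 2):
        stat, pval = mannwhitneyu(groups[a], groups[b], alternative="two-sided")
        results[(a, b)] = (float(stat), float(pval))
    return results


results = pairwise_mannwhitney(entropy_by_region)
for (a, b), (u_stat, pval) in results.items():
    print(f"{a} vs {b}: U={u_stat:.1f}, p={pval:.3g}")
```

With three pairwise comparisons, a multiple-testing correction might be worth considering, depending on how the p-values are reported.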

jbloom commented 1 month ago

Looks good. Maybe make the points a bit darker or draw a thin black border (using stroke, strokeWidth in altair) as they are hard to see in current plots.

Caleb-Carr commented 4 weeks ago

I'm closing this issue because the points in the latest plot have been altered to be more visible.