petercombs / MPRA_selection


Come up with a figure or three on biological results #9

Open petercombs opened 5 years ago

petercombs commented 5 years ago

My first thought here is to just have a scatterplot with log10 pvalue for upregulation on the x axis, and log10 pvalue for downregulation on the y axis.

  |  p               p p 
l |
o |
g |  p                  
p |     e    e
  |   e e        p     p   p
  +---------------------
           log p 

Where p are the promoters and e are the enhancers.

petercombs commented 5 years ago

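Okay, so let's try plotting a polarized -log10 p-value: if the ratio (KdKn on the x axis, KuKn on the y axis) is greater than 1, the point goes to the right of/above the axis by an amount proportional to the -log10 p-value; if it is less than 1, it goes to the left/below by the same amount.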
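To make the transform concrete, here is a minimal sketch of the polarized score for a single element. The function name `polarized_neglog10` is made up for illustration, and treating a ratio of exactly 1 as a score of 0 is an assumption, not something decided above.

```python
import math

def polarized_neglog10(ratio, pvalue):
    """Return -log10(pvalue), signed by whether ratio is above or below 1."""
    magnitude = -math.log10(pvalue)
    if ratio > 1:
        return magnitude       # plotted to the right of / above the axis
    if ratio < 1:
        return -magnitude      # plotted to the left of / below the axis
    return 0.0                 # assumption: a ratio of exactly 1 has no direction
```

So a ratio of 2 with p = 0.001 lands at +3, and a ratio of 0.5 with p = 0.01 lands at -2.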

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

scores = pd.read_table('ElementWiseScores - Sheet1.tsv', index_col=0)

def polarized(ratio, pvalue):
    # -log10(p), positive when ratio > 1 and negative when ratio < 1
    return -np.sign(np.log(ratio)) * np.log10(pvalue)

fig, ax = plt.subplots()
x = polarized(scores.KdKn, scores.p_KdKn)
y = polarized(scores.KuKn, scores.p_KuKn)
# One marker per element type (the marker list only covers two types)
for i, e in enumerate(scores.Type.unique()):
    hits = scores.Type == e
    ax.scatter(x[hits], y[hits], marker=['p', '*'][i], label=e)
# Move the left and bottom spines to the origin and hide the other two
ax.spines['left'].set_position(('data', 0))
ax.spines['bottom'].set_position(('data', 0))
ax.spines['right'].set_alpha(0)
ax.spines['top'].set_alpha(0)
ax.legend(loc='center left')
# Label every point with its element name
for ix in scores.index:
    ax.text(x.loc[ix], y.loc[ix], ix)
plt.show()

image

Okay, so that figure is really busy, and it only partly supports a trend I thought I saw: that KdKn (x axis) was significantly more likely to be less than 1 (not obviously true from this plot), while KuKn was more likely to be above 1 (almost certainly true).
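One way to put a number on the "more likely to be above 1" impression is an exact two-sided sign test on the ratios. This is only a sketch and not part of the analysis above: the name `sign_test` is hypothetical, ratios exactly equal to 1 are dropped as ties, and it treats elements as independent, which may not hold here.

```python
from math import comb

def sign_test(ratios):
    """Exact two-sided sign test for ratios being above vs. below 1.

    Returns (n_above, n_below, p_value) under the null hypothesis that
    each ratio is equally likely to fall on either side of 1.
    """
    ratios = list(ratios)
    n_above = sum(1 for r in ratios if r > 1)
    n_below = sum(1 for r in ratios if r < 1)
    n = n_above + n_below          # ties (ratio == 1) are ignored
    if n == 0:
        return n_above, n_below, 1.0
    # Binomial(n, 0.5) tail counts, kept as integers until the final division
    lower = sum(comb(n, i) for i in range(0, n_above + 1))
    upper = sum(comb(n, i) for i in range(n_above, n + 1))
    p_value = min(1.0, 2 * min(lower, upper) / 2 ** n)
    return n_above, n_below, p_value
```

With the table loaded above, this could be run as `sign_test(scores.KuKn)` and `sign_test(scores.KdKn)` to compare the two axes.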

petercombs commented 5 years ago

From Hunter:

Could you try randomly shuffling the expression change assigned to each mutation and see if the significant results go away, as a negative control?

And could you generate all the trees with up/down-regulation indicated on each branch (either by color or by numbers)? We could pick a couple for the main figures, and the rest could go in the supplement.
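A rough sketch of the shuffling control Hunter describes, in plain Python. Everything here is a stand-in: `shuffle_control` and `mean_difference` are hypothetical names, and the statistic (difference in mean expression change between two groups of mutations) is an assumption used only to show the mechanics. The real control would plug in whatever test produced the p-values in the table.

```python
import random

def mean_difference(changes, groups):
    """Placeholder statistic: mean change in group True minus group False."""
    a = [c for c, g in zip(changes, groups) if g]
    b = [c for c, g in zip(changes, groups) if not g]
    return sum(a) / len(a) - sum(b) / len(b)

def shuffle_control(changes, groups, statistic=mean_difference,
                    n_shuffles=1000, seed=0):
    """Shuffle expression changes across mutations and recompute the statistic.

    Returns (observed, empirical_p), where empirical_p is the fraction of
    shuffles at least as extreme as the observed value, with the usual +1
    correction so it is never exactly zero.
    """
    rng = random.Random(seed)
    observed = statistic(changes, groups)
    shuffled = list(changes)       # copy so the caller's data is untouched
    n_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)      # breaks the mutation-to-change assignment
        if abs(statistic(shuffled, groups)) >= abs(observed):
            n_extreme += 1
    return observed, (n_extreme + 1) / (n_shuffles + 1)
```

If the significant results are real, the shuffled data should rarely match the observed statistic, so the empirical p-value stays small. If they survive shuffling, something in the pipeline is producing significance on its own.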