FunctionLab / ExPecto

predicting expression effects of human genome variants ab initio from sequence

averaging forward and reverse strand from output of chromatin.py #22

Open cmlakhan opened 3 years ago

cmlakhan commented 3 years ago

For other purposes, I want a single chromatin effect score for each SNP. From what I can tell, if

X = num_variants

then for each SNP i, the forward- and reverse-strand predictions are at rows i and i+X. Is that correct? So if I want a single score, I should average row i and row i+X? That is the impression I get from the code in predict.py below:

snp_temp = (np.asarray(h5f[index_start:index_end,:])+ np.asarray(h5f[index_start+int(h5f.shape[0]/2):index_end+int(h5f.shape[0]/2),:]))/2.0

Just wanted to clarify, thanks!

jzthree commented 3 years ago

Thanks for the question! Yes, that is right. The first half of the chromatin predictions is computed from the forward-strand sequences, and the second half from the same number of reverse-complement sequences.
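
To make the row layout concrete, here is a minimal sketch of the averaging, assuming the predictions are loaded as a 2-D array with 2X rows (forward strand first, reverse complement second). The function name `average_strands` and the toy arrays are made up for illustration and are not part of ExPecto.

```python
import numpy as np


def average_strands(preds):
    """Average forward and reverse-complement predictions per variant.

    Assumes `preds` has shape (2 * X, n_features): rows 0..X-1 hold the
    forward-strand predictions and rows X..2X-1 hold the reverse-complement
    predictions for the same variants, in the same order.
    """
    preds = np.asarray(preds)
    if preds.shape[0] % 2 != 0:
        raise ValueError("expected an even number of rows (forward + reverse)")
    n_variants = preds.shape[0] // 2
    # Row i (forward) is paired with row i + n_variants (reverse complement).
    return (preds[:n_variants] + preds[n_variants:]) / 2.0


# Toy data (hypothetical values): 3 variants, 2 chromatin features each.
forward = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
reverse = np.array([[0.3, 0.4], [0.5, 0.6], [0.7, 0.8]])
stacked = np.vstack([forward, reverse])  # shape (6, 2)

averaged = average_strands(stacked)  # shape (3, 2), one row per variant
```

For a large HDF5 dataset it may be preferable to slice in chunks, as the predict.py line quoted above does, rather than load everything at once.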

cmlakhan commented 3 years ago

Thanks for clarifying! You typically just average the two values, correct?