morphy380 opened 2 years ago
After selecting variable genes, the variable gene expression matrix was renormalized. See the "Preparation of expression matrices" section of https://www.cell.com/cell/fulltext/S0092-8674(19)30039-X#secsectitle0115.
Thanks for the answer. I did read that section, but I still can't replicate the result when renormalizing. Here is my code (with adata and adataf as loaded above).
I tried the following:
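import numpy as np
import matplotlib.pyplot as plt

# First 1000 cells. df: variable-gene matrix; dff: full matrix.
df = adata[:1000,:].to_df()
dff = adataf[:1000,:].to_df()
assert np.all(df.index==dff.index)
# Attempt 1: subset to the variable genes, scale each cell to 10,000, then log1p.
dff_hvg = dff[df.columns]
dff_hvg_norm = 10000*dff_hvg.div(dff_hvg.sum(axis=1),axis=0)
dff_hvg_norm_log = np.log1p(dff_hvg_norm)
# Attempt 2: normalize the full matrix first, then subset, renormalize, and log1p.
dff_norm = 10000*dff.div(dff.sum(axis=1),axis=0)
dff_norm_hvg = dff_norm[df.columns]
dff_norm_hvg_norm = 10000*dff_norm_hvg.div(dff_norm_hvg.sum(axis=1),axis=0)
dff_norm_hvg_norm_log = np.log1p(dff_norm_hvg_norm)
# Compare each attempt with the provided values for a single gene.
plt.scatter(x=dff_hvg_norm_log['Fam150a'],y=df['Fam150a'],s=1)
plt.scatter(x=dff_norm_hvg_norm_log['Fam150a'],y=df['Fam150a'],s=1)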
I am confused about how the data was transformed between the variable genes matrix and the full matrix provided. It seems the variable genes matrix is normalized and log-transformed, but I don't get a perfect correlation after applying this transformation. Could you provide the code for that preprocessing?
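To put a number on the mismatch instead of reading it off a scatter plot, here is a minimal sketch of the check I have in mind. It assumes the preprocessing is "subset to variable genes, scale each cell to 10,000, log1p", which is only my guess and not the authors' confirmed pipeline. The helper names `subset_normalize_log1p` and `compare` are hypothetical, and the data below is synthetic so that the snippet runs on its own.

```python
import numpy as np
import pandas as pd


def subset_normalize_log1p(counts, genes, scale=10_000):
    """Subset to `genes`, rescale each cell (row) to sum to `scale`, then log1p.

    Assumed pipeline, not the authors' confirmed preprocessing.
    """
    sub = counts[genes]
    totals = sub.sum(axis=1)
    # Avoid division by zero: cells with no counts stay as all-zero rows.
    normed = sub.div(totals.where(totals > 0, 1), axis=0) * scale
    return np.log1p(normed)


def compare(candidate, reference):
    """Return per-gene Pearson correlations and the largest absolute difference."""
    reference = reference.loc[candidate.index, candidate.columns]
    per_gene_r = candidate.corrwith(reference)
    max_abs_diff = float(np.abs(candidate.values - reference.values).max())
    return per_gene_r, max_abs_diff


# Synthetic stand-in for the real matrices (hypothetical cells and genes).
rng = np.random.default_rng(0)
counts = pd.DataFrame(
    rng.poisson(2.0, size=(50, 8)),
    index=[f"cell{i}" for i in range(50)],
    columns=[f"gene{j}" for j in range(8)],
)
hvg = ["gene1", "gene3", "gene6"]

# Here the "reference" is built with the same transform, so the match is exact.
reference = subset_normalize_log1p(counts, hvg)
candidate = subset_normalize_log1p(counts, hvg)
per_gene_r, max_abs_diff = compare(candidate, reference)
```

With the real data, the call would be `compare(subset_normalize_log1p(dff, df.columns), df)`; genes with low correlation or a large `max_abs_diff` would show where the assumed transformation departs from the provided matrix. Note also that my two attempts above are algebraically the same transformation: a per-cell rescaling applied before subsetting cancels out when the subset is renormalized, so they should agree up to floating-point error.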