Closed caiquanyou closed 3 years ago
Thanks for the report!
It's also not immediately clear to me why this happens. Could you please run the following lines before .reconcile_models() to save these objects as pickle files and post them here? That may help track down the problem. Thanks!
import pickle
with open("debug_hits_dist.pkl", "wb") as f:
    pickle.dump(data_obj2_hits.dist, f)
with open("debug_hits_pval.pkl", "wb") as f:
    pickle.dump(data_obj2_hits.pval, f)
@Jeff1995 OK, here are the two pkl files, in debug.zip
It seems that I cannot reproduce the error under numpy 1.14.6. I suspect it's a numpy version issue. What numpy version are you using?
I use numpy 1.17.2
I think I figured it out. It was not a numpy version problem, but rather because only one DIRECTi model was used in BLAST. In that case the singleton "model" dimension (axis=1) in the hits.dist array was missing, so taking the mean over axis=1 referred to a non-existent axis.
If only one model was used, .reconcile_models() is not necessary. You can just remove .reconcile_models() and continue with downstream steps.
Meanwhile, .reconcile_models() should also work even if only one model was used (it should simply do nothing). It will be fixed in a future release.
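The axis problem described above can be reproduced with plain NumPy. This is only a sketch under stated assumptions: the arrays are made-up stand-ins for one query's hit distances, not real Cell BLAST output, and restoring the singleton axis is one possible guard, not necessarily how the library fixes it.

```python
import numpy as np

# Hypothetical stand-in for one query's hit distances (not real Cell BLAST output).
# With several models the array is 2-D: (n_hits, n_models).
dist_multi = np.array([[0.1, 0.3], [0.2, 0.4], [0.5, 0.7]])
print(dist_multi.mean(axis=1))  # averages across models, one value per hit

# With the model axis missing the array is 1-D, so axis=1 does not exist.
dist_single = np.array([0.1, 0.2, 0.5])
try:
    dist_single.mean(axis=1)
    failed = False
except IndexError as err:  # NumPy's AxisError is a subclass of IndexError
    failed = True
    print("IndexError:", err)

# Restoring the singleton axis makes the mean over axis=1 a no-op.
restored = dist_single[:, np.newaxis]  # shape (3, 1)
reconciled = restored.mean(axis=1)
```

With the axis restored, averaging over a single model returns the original distances unchanged, which matches the "does nothing" behaviour expected for one model.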
But I used 4 models. The code is below:
models = []
for i in range(4):
    models.append(cb.directi.fit_DIRECTi(
        data_obj, genes=selected_genes,
        latent_dim=10, cat_dim=20, random_seed=i
    ))
blast = cb.blast.BLAST(models, data_obj)
data_obj2_hits = blast.query(data_obj2)
data_obj2_hits = data_obj2_hits.reconcile_models().filter(by="pval", cutoff=0.05)  # error here
Well, that would be strange... Can you confirm that the data_obj2_hits.reconcile_models() line was not executed more than once? If that is the case, could you provide the data_obj object (as an h5 file) and the selected_genes object (as a text file), so I can try to reproduce the error?
I do not use selected_genes, axes = data_obj.find_variable_genes() to produce the gene list; I use the highly variable (HV) genes I found earlier. Could that cause this problem? Also, how can I save the data_obj? It was created by data_obj = cb.data.ExprDataSet(exprs=adata.X, obs=adata.obs, var=adata.var, uns=adata.uns)
I think the gene list shouldn't be the cause. You can save the data_obj with data_obj.write_dataset("filename.h5").
OK, the file contains data1.h5 and gene.csv
Large attachment sent from QQ Mail
data.zip (221.35M, expires 2021-02-05 15:44) Download page: http://mail.qq.com/cgi-bin/ftnExs_download?t=exs_ftn_download&k=0c6331328a524fc4b34d870a16385719095b0650530c060f1c5201070015510709531c0a025b501b57065754530f060705530800303065525017501c4a5115360c&code=1c1208e6
I tried on this data (using the training data data_obj as query since I do not have data_obj2), but I could not reproduce the error using the following script:
import pandas as pd
import Cell_BLAST as cb
data_obj = cb.data.ExprDataSet.read_dataset("data1.h5")
selected_genes = pd.read_csv("gene.csv", index_col=0).to_numpy().ravel().tolist()
models = []
for i in range(4):
    models.append(cb.directi.fit_DIRECTi(
        data_obj, genes=selected_genes,
        latent_dim=10, cat_dim=20, random_seed=i
    ))
blast = cb.blast.BLAST(models, data_obj)
data_obj_hits = blast.query(data_obj)
data_obj_hits = data_obj_hits.reconcile_models().filter(by="pval", cutoff=0.05)
print("Done!")
Could you please try running this as a Python script (not as a Jupyter notebook) and see if it works on your side?
If the error persists, it would most likely be an environment issue. You may need to provide your detailed environment specification (via conda env export) so I can try to reproduce it.
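Alongside the full environment export, a short report printed from both the script and the Jupyter kernel can show whether the two are actually using the same interpreter and NumPy build. This is a minimal sketch; the helper name environment_report is hypothetical and not part of Cell BLAST.

```python
import platform
import sys

import numpy as np


def environment_report():
    """Collect interpreter and NumPy details for a bug report (hypothetical helper)."""
    return {
        "python_executable": sys.executable,
        "python_version": platform.python_version(),
        "numpy_version": np.__version__,
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

If the executable path or the NumPy version differs between the script and the notebook, the notebook kernel is probably bound to a different environment.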
@Jeff1995 It works as a script, but still fails in the Jupyter notebook
Okay. I think the most likely cause is that you ran the following line more than once in the Jupyter notebook:
data_obj_hits = data_obj_hits.reconcile_models().filter(by="pval", cutoff=0.05)
It should be run only once. If you run it a second time it will produce the error.
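Why a second run can fail is sketched below with plain NumPy, assuming the mechanism is the same missing model axis described earlier in this thread. The reconcile function and the values are hypothetical stand-ins, not the Cell BLAST implementation.

```python
import numpy as np


def reconcile(dist):
    """Hypothetical stand-in for a reconcile step: average over the model axis."""
    return dist.mean(axis=1)


raw_hits = np.array([[0.1, 0.3], [0.2, 0.4]])  # made-up (n_hits, n_models) values

# Fragile pattern: overwriting the input means a second run sees 1-D data.
hits = raw_hits
hits = reconcile(hits)  # first run: fine, result is 1-D
try:
    hits = reconcile(hits)  # second run: axis 1 no longer exists
    second_run_failed = False
except IndexError:
    second_run_failed = True

# Re-runnable pattern: never overwrite the raw query result.
reconciled = reconcile(raw_hits)
reconciled = reconcile(raw_hits)  # running the cell again gives the same result
```

In a notebook, keeping the output of blast.query(...) under its own name and assigning the reconciled, filtered hits to a different variable makes the cell safe to re-run.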
Thanks!
Hi @Jeff1995, I ran the code data_obj2_hits = data_obj2_hits.reconcile_models().filter(by="pval", cutoff=0.05) and got the error below: IndexError Traceback (most recent call last)