Closed fe4960 closed 1 year ago
Hi @Jeff1995
I found that the reason is that all the values in atac.var["highly_variable"] are False. I checked the steps used to preprocess the atac object and found that atac.var["highly_variable"] initially contains both False and True values. But after running "guidance = scglue.genomics.rna_anchored_guidance_graph(rna, atac)", the column contains only False values. Please see the details below. Could you suggest how to fix it? Thanks a lot!
print(atac.var["highly_variable"].value_counts())
False    340465
True      85116
Name: highly_variable, dtype: int64
atac.var["chrom"] = split.map(lambda x: x[0])
atac.var["chromStart"] = split.map(lambda x: x[1]).astype(int)
atac.var["chromEnd"] = split.map(lambda x: x[2]).astype(int)
atac.var.head()
print(atac.var["highly_variable"].value_counts())
False    340465
True      85116
Name: highly_variable, dtype: int64
guidance = scglue.genomics.rna_anchored_guidance_graph(rna, atac)
print(atac.var["highly_variable"].value_counts())
False    425581
Name: highly_variable, dtype: int64
scglue.graph.check_graph(guidance, [rna, atac])
[INFO] check_graph: Checking variable coverage...
[INFO] check_graph: Checking edge attributes...
[INFO] check_graph: Checking self-loops...
[INFO] check_graph: Checking graph symmetry...
[INFO] check_graph: All checks passed!
print(atac.var["highly_variable"].value_counts())
False    425581
Name: highly_variable, dtype: int64
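The symptom in the log above (a modality left with no highly variable features) can be caught before training with a small pre-flight check. This is only a minimal sketch: the helper names `count_flags` and `modalities_without_hvf` are hypothetical and not part of scglue, and the toy lists stand in for the real `highly_variable` columns.

```python
from collections import Counter


def count_flags(flags):
    """Count True/False values, similar to value_counts() on a boolean column."""
    counts = Counter(bool(f) for f in flags)
    return {True: counts.get(True, 0), False: counts.get(False, 0)}


def modalities_without_hvf(flags_by_modality):
    """Return the names of modalities whose flags contain no True value."""
    return [name for name, flags in flags_by_modality.items() if not any(flags)]


# Toy flags standing in for rna.var["highly_variable"] and
# atac.var["highly_variable"] (assumed data, not taken from the thread).
flags_by_modality = {
    "rna": [True, False, True],
    "atac": [False, False, False, False],
}

empty = modalities_without_hvf(flags_by_modality)
print(empty)  # ['atac']
```

With real data, one could pass something like `{"rna": list(rna.var["highly_variable"]), "atac": list(atac.var["highly_variable"])}` and stop before model fitting if the returned list is not empty.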
The scglue.genomics.rna_anchored_guidance_graph function propagates highly variable genes from the RNA modality to the ATAC modality based on the guidance graph, which by default is constructed from genomic overlap between ATAC peaks and RNA gene body + promoter regions.
So in principle, this can only happen if none of the highly variable RNA genes overlaps any ATAC peak in its gene body + promoter region, which should be rare. Could you check how many RNA genes were marked as "highly variable" in your data? Could it be that all RNA genes were marked as False? Also, for troubleshooting, you could try setting extend_range to a larger value such as 150000 to see whether that covers more ATAC peaks.
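The propagation idea described above can be illustrated with a toy interval-overlap sketch. This is a simplified illustration and not scglue's actual implementation: the functions `overlaps`, `extend`, and `propagate_hv` are hypothetical, each gene region is assumed to already include its promoter, and `extend_range` here simply widens that region on both sides.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) intervals share a chromosome and overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def extend(region, extend_range):
    """Widen a region by extend_range on both sides, clipping the start at 0."""
    chrom, start, end = region
    return (chrom, max(0, start - extend_range), end + extend_range)


def propagate_hv(genes, peaks, extend_range=0):
    """Flag each peak that overlaps the (extended) region of a highly variable gene.

    genes: list of ((chrom, start, end), is_highly_variable)
    peaks: list of (chrom, start, end)
    """
    hv_windows = [extend(region, extend_range) for region, is_hv in genes if is_hv]
    return [any(overlaps(peak, window) for window in hv_windows) for peak in peaks]


# Toy data (assumed): one highly variable gene on chr1, one non-variable gene on chr2.
genes = [(("chr1", 1000, 5000), True), (("chr2", 2000, 3000), False)]
peaks = [("chr1", 4500, 4800), ("chr1", 9000, 9500), ("chr2", 2100, 2300)]

default_flags = propagate_hv(genes, peaks)
wide_flags = propagate_hv(genes, peaks, extend_range=5000)
print(default_flags)  # [True, False, False]
print(wide_flags)     # [True, True, False]
```

In this toy model, a peak is never flagged when no highly variable gene region lies on a chromosome with the same name, no matter how large the extension is.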
Thanks a lot for the reply! I found that the error occurs because the chromosome names in the gene annotation GTF file do not have the "chr" prefix, while the ATAC peak coordinates do. After modifying the GTF file, the error was fixed.
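A naming mismatch like this can be detected by comparing the chromosome names used by the two modalities before building the guidance graph. The sketch below is one hypothetical way to do that: `shared_chromosomes` and `add_chr_prefix` are illustrative helpers, not scglue functions, and the toy lists stand in for columns such as `rna.var["chrom"]` and `atac.var["chrom"]`.

```python
def shared_chromosomes(rna_chroms, atac_chroms):
    """Return the chromosome names that appear in both modalities."""
    return set(rna_chroms) & set(atac_chroms)


def add_chr_prefix(name):
    """Prepend "chr" unless the name already starts with it.

    Note: this does not handle special cases such as "MT" vs "chrM".
    """
    name = str(name)
    return name if name.startswith("chr") else "chr" + name


# Toy names (assumed): annotation without the prefix, peaks with it.
gtf_chroms = ["1", "2", "X"]
peak_chroms = ["chr1", "chr2", "chrX"]

before = shared_chromosomes(gtf_chroms, peak_chroms)
fixed = [add_chr_prefix(c) for c in gtf_chroms]
after = shared_chromosomes(fixed, peak_chroms)

print(len(before))    # 0
print(sorted(after))  # ['chr1', 'chr2', 'chrX']
```

An empty intersection is a strong hint that the annotation and the peaks follow different naming conventions. Whether to fix the names in the GTF file (as was done here) or in the loaded objects is a matter of preference.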
Hello,
Thanks a lot for developing this great package. I encountered an error when running scglue.models.fit_SCGLUE. The scglue version is 0.3.2, and I ran it on a compute node with a GPU.
I wonder if you could help fix this error. Thanks very much.
glue = scglue.models.fit_SCGLUE(
...     {"rna": rna, "atac": atac}, guidance_hvf,
...     fit_kws={"directory": "glue"} #,init_kws={"random_seed": 0}
... )
[INFO] fit_SCGLUE: Pretraining SCGLUE model...
Traceback (most recent call last):
  File "", line 1, in
  File "software/anaconda3/envs/scvi-env/lib/python3.9/site-packages/scglue/models/__init__.py", line 204, in fit_SCGLUE
    pretrain = model(adatas, sorted(graph.nodes), **pretrain_init_kws)
  File "software/anaconda3/envs/scvi-env/lib/python3.9/site-packages/scglue/models/scglue.py", line 729, in __init__
    if idx[k].min() < 0:
  File "software/anaconda3/envs/scvi-env/lib/python3.9/site-packages/numpy/core/_methods.py", line 44, in _amin
    return umr_minimum(a, axis, None, out, keepdims, initial, where)
ValueError: zero-size array to reduction operation minimum which has no identity
Here is the information on the rna and atac objects: