Closed jenelysRO closed 7 months ago
Hi @jenelysRO
This seems to be an error coming from line 78 of your script scenic_plus_run.py:
  File "scenic_plus_run.py", line 78, in <module>
    raise(e)
I guess you have a try/except statement there? It probably looks like this:
try:
    <CODE>
except Exception as e:
    raise(e)
Best,
Seppe
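To illustrate the point about the re-raise: a traceback that ends at a `raise` line inside an `except` block only shows where the exception was re-raised, and the real cause sits in the frames beneath it. The sketch below uses only the standard library; `failing_step` and `run_with_reraise` are hypothetical stand-ins, not part of SCENIC+ or the user's script.

```python
import traceback


def failing_step():
    # Hypothetical stand-in for the real pipeline call that fails.
    raise ValueError("Intersection of gene_names and tf_names is empty.")


def run_with_reraise():
    try:
        failing_step()
    except Exception as e:
        # Re-raising keeps the original exception; this frame is simply
        # added to the traceback on top of the one where the error started.
        raise e


def captured_traceback():
    # Capture the formatted traceback as text instead of crashing.
    try:
        run_with_reraise()
    except ValueError:
        return traceback.format_exc()
    return ""


tb_text = captured_traceback()
print(tb_text)
```

The printed traceback lists both `run_with_reraise` and `failing_step`, so the frame to look at is the deepest one rather than the line containing the re-raise.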
Thanks for your quick response! I ran the code without the try/except block, which got rid of that error. However, I now run into another error:
2024-04-11 15:38:30,410 INFO worker.py:1715 -- Started a local Ray instance. View the dashboard at http://127.0.0.1:8265
initializing: 0it [00:00, ?it/s]
Running using 8 cores: 0it [00:00, ?it/s]2024-04-11 15:38:33,028 R2G INFO Took 8.398192167282104 seconds
An error occured!
2024-04-11 15:38:33,079 SCENIC+_wrapper INFO Inferring TF to gene relationships
Traceback (most recent call last):
  File "scenic_plus_run.py", line 63, in <module>
    run_scenicplus(
  File "/tools/python/3.8.18.GCC10/lib/python3.8/site-packages/scenicplus/wrappers/run_scenicplus.py", line 156, in run_scenicplus
    calculate_TFs_to_genes_relationships(scplus_obj,
  File "/tools/python/3.8.18.GCC10/lib/python3.8/site-packages/scenicplus/TF_to_gene.py", line 283, in calculate_TFs_to_genes_relationships
    ex_matrix, gene_names, tf_names = _prepare_input(
  File "/tools/python/3.8.18.GCC10/lib/python3.8/site-packages/arboreto/algo.py", line 229, in _prepare_input
    raise ValueError('Intersection of gene_names and tf_names is empty.')
ValueError: Intersection of gene_names and tf_names is empty.
I checked my TF file and it is formatted similar to the one in the pbmc tutorial, and the scenic object looks alright too and has all the necessary slots. How could I fix this?
Hi @jenelysRO
First of all, I would strongly recommend using the development branch of SCENIC+. As a heads-up, I'm going to switch over to that branch within the next week, so the code you are using now will be deprecated. See https://github.com/aertslab/scenicplus/tree/development for more info.
Can you run the following code and show the output please:
from arboreto.utils import load_tf_names
tf_names = load_tf_names(<TF_FILE>)
print(tf_names[0:10])
gene_names = scplus_obj.gene_names
cell_names = scplus_obj.cell_names
ex_matrix = scplus_obj.X_EXP
print(gene_names[0:10])
print(len(set(tf_names) & set(gene_names)))
Best,
Seppe
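The check requested above boils down to a set intersection between the TF list and the gene names. A slightly extended sketch of the same idea follows; it is an illustration only, not part of arboreto or SCENIC+. The helper name `diagnose_overlap` and the toy lists are assumptions. It also reports a case-insensitive overlap and flags gene names that are all digits, which would suggest positional row numbers rather than gene symbols.

```python
def diagnose_overlap(tf_names, gene_names):
    """Summarise how well a TF list matches the gene names of a dataset."""
    tf_set = {str(t) for t in tf_names}
    gene_set = {str(g) for g in gene_names}

    # Exact matches: the overlap the ValueError above complains about.
    exact = tf_set & gene_set
    # Case-insensitive matches hint at a naming-convention mismatch.
    case_insensitive = {t.lower() for t in tf_set} & {g.lower() for g in gene_set}
    # All-digit names suggest row numbers were used instead of gene symbols.
    looks_positional = bool(gene_set) and all(g.isdigit() for g in gene_set)

    return {
        "n_exact": len(exact),
        "n_case_insensitive": len(case_insensitive),
        "genes_look_positional": looks_positional,
    }


# Toy inputs (hypothetical), not the real TF file or scplus_obj.
tf_names = ["Adnp", "Aebp1", "Ahr"]
broken = diagnose_overlap(tf_names, ["0", "1", "2", "3"])
healthy = diagnose_overlap(tf_names, ["Xkr4", "Ahr", "ADNP", "Rp1"])
print(broken)
print(healthy)
```

With real data, `tf_names` would come from `load_tf_names(...)` and `gene_names` from `scplus_obj.gene_names`, as in the snippet above.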
I see, the genes in scplus_obj are in an "_index" column within scplus_obj.metadata_genes, resulting in the following:
>>> from arboreto.utils import load_tf_names
>>> tf_names = load_tf_names('/scratch/jruiz/mouse_tf_gerg.gsc.riken.jp.txt')
>>>
>>> print(tf_names[0:10])
['Adnp', 'Aebp1', 'Aebp2', 'Nr0b1', 'Ahr', 'Ahrr', 'Aip', 'Aire', 'Alx3', 'Alx4']
>>>
>>> gene_names = scplus_obj.gene_names
>>> cell_names = scplus_obj.cell_names
>>> ex_matrix = scplus_obj.X_EXP
>>>
>>> print(gene_names[0:10])
Index(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], dtype='object')
>>>
>>> print(len(set(tf_names) & set(gene_names)))
0
>>> scplus_obj.gene_names
Index(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
...
'24589', '24590', '24591', '24592', '24593', '24594', '24595', '24596',
'24597', '24598'],
dtype='object', length=24599)
>>> scplus_obj.metadata_genes
               _index
0                Xkr4
1              Gm1992
2             Gm37381
3                 Rp1
4              Mrpl15
...               ...
24594  CAAA01165726.1
24595        Hashtag5
24596        Hashtag6
24597        Hashtag7
24598        Hashtag8

[24599 rows x 1 columns]
I can't directly overwrite scplus_obj.gene_names with the values in scplus_obj.metadata_genes['_index']. How would you suggest I fix this? Thanks in advance!
Just wanted to post the solution here! It seems the error was caused by processing the scRNA-seq data with Seurat rather than Scanpy. When I created the SCENIC+ object from adata.raw.to_adata(), the .var slot had numbers as row names and an "_index" column holding the gene names, so create_SCENICPLUS_object took the numeric row names as the genes in scplus_obj.gene_names. To fix this issue, I did the following:
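import numpy as np

test_data = adata.raw.to_adata()
# .var has a single "_index" column holding the gene names; make it the index
test_data.var.columns = ['genes']
test_data.var.set_index('genes', inplace=True)
scplus_obj = create_SCENICPLUS_object(
    GEX_anndata=test_data,
    cisTopic_obj=cistopic_obj,
    menr=menr)
# convert the expression matrix to a dense NumPy array
scplus_obj.X_EXP = np.array(scplus_obj.X_EXP.todense())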
Then I continued with running scenicplus and it worked!
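For anyone who wants to see the effect of that fix in isolation, here is a minimal sketch on a plain pandas DataFrame that mimics the `.var` table described above. The helper `promote_gene_column` and the toy data are assumptions made for illustration. It is not SCENIC+ or AnnData code, and a real `.var` may carry additional columns.

```python
import pandas as pd


def promote_gene_column(var, column="_index"):
    """Return a copy of `var` indexed by the gene names stored in `column`."""
    if column not in var.columns:
        # Nothing to fix: assume the index already holds gene names.
        return var.copy()
    fixed = var.set_index(column)          # returns a new DataFrame
    fixed.index = fixed.index.astype(str)  # gene names as strings
    fixed.index.name = "genes"
    return fixed


# Toy stand-in for the .var table: numeric row names plus an "_index" column.
var = pd.DataFrame(
    {"_index": ["Xkr4", "Gm1992", "Rp1", "Ahr"]},
    index=["0", "1", "2", "3"],
)
fixed = promote_gene_column(var)

tf_names = ["Adnp", "Ahr"]  # hypothetical TF list
n_before = len(set(tf_names) & set(var.index))
n_after = len(set(tf_names) & set(fixed.index))
print(n_before, n_after)
```

Before the fix the row names are numbers, so the intersection with the TF list is empty; afterwards the gene names are the index and the overlap is non-zero.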
Describe the bug
Hi, I am trying to use SCENIC+ with mouse data. I ran through the entire pbmc tutorial with the wrapper function and got everything to work. When I get to run_scenicplus() with my own data though, I get an error right before inferring TF to gene relationships and then SCENIC+ stops running.
To Reproduce
Error output
Expected behavior
For SCENIC+ to continue running