aertslab / pySCENIC

pySCENIC is a lightning-fast python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0

Error when running modules = list(modules_from_adjacencies(adjacencies, ex_matrix)) #149

Closed morganee261 closed 4 years ago

morganee261 commented 4 years ago

Hello,

I am getting this error when running modules = list(modules_from_adjacencies(adjacencies, ex_matrix)):

2020-03-23 13:46:47,461 - pyscenic.utils - INFO - Calculating Pearson correlations.

2020-03-23 13:46:47,461 - pyscenic.utils - WARNING - Note on correlation calculation: the default behaviour for calculating the correlations has changed after pySCENIC verion 0.9.16. Previously, the default was to calculate the correlation between a TF and target gene using only cells with non-zero expression values (mask_dropouts=True). The current default is now to use all cells to match the behavior of the R verision of SCENIC. The original settings can be retained by setting 'rho_mask_dropouts=True' in the modules_from_adjacencies function, or '--mask_dropouts' from the CLI. Dropout masking is currently set to [False].

Traceback (most recent call last):
  File "/home/Morgane/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2897, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 128, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index_class_helper.pxi", line 91, in pandas._libs.index.Int64Engine._check_type
KeyError: 'RPS19'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "", line 1, in
  File "/home/Morgane/anaconda3/lib/python3.7/site-packages/pyscenic/utils.py", line 268, in modules_from_adjacencies
    rho_threshold=rho_threshold, mask_dropouts=rho_mask_dropouts)
  File "/home/Morgane/anaconda3/lib/python3.7/site-packages/pyscenic/utils.py", line 136, in add_correlation
    rhos = np.array([corr_mtx[s2][s1] for s1, s2 in zip(adjacencies.TF, adjacencies.target)])
  File "/home/Morgane/anaconda3/lib/python3.7/site-packages/pyscenic/utils.py", line 136, in
    rhos = np.array([corr_mtx[s2][s1] for s1, s2 in zip(adjacencies.TF, adjacencies.target)])
  File "/home/Morgane/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 2980, in getitem
    indexer = self.columns.get_loc(key)
  File "/home/Morgane/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2899, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 128, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index_class_helper.pxi", line 91, in pandas._libs.index.Int64Engine._check_type
KeyError: 'RPS19'

I have read about other users' errors and tried troubleshooting, without any luck.

Thanks for your help, Morgane

cflerin commented 4 years ago

Hi,

This issue looks similar to #103, but I'm not sure if it was ever really solved. Perhaps you could try running this step via the CLI?
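One possible reading of the traceback above, offered as an assumption rather than a confirmed diagnosis: the `Int64Engine` frames indicate that the lookup for 'RPS19' was made against integer labels, which could happen if the columns of `ex_matrix` are numbered positions instead of gene names (for example, a matrix loaded without its header or left transposed). A minimal sketch of a check is below. The helper `missing_genes` and the toy data are hypothetical and are not part of pySCENIC.

```python
def missing_genes(adjacency_genes, matrix_columns):
    """Return the sorted gene names from the adjacencies that are not column labels."""
    columns = set(matrix_columns)
    return sorted({gene for gene in adjacency_genes if gene not in columns})


# Toy stand-ins (hypothetical data). In a real session these would be
# list(adjacencies.TF) + list(adjacencies.target) and ex_matrix.columns.
adjacency_genes = ["RPS19", "SPI1", "RPS19", "CEBPB"]

integer_columns = [0, 1, 2]                  # gene names lost: columns are positions
named_columns = ["RPS19", "SPI1", "CEBPB"]   # cells x genes, labelled by gene name

# With integer labels every gene is reported as missing.
print(missing_genes(adjacency_genes, integer_columns))

# With gene names as labels nothing is missing.
print(missing_genes(adjacency_genes, named_columns))
```

If the first situation applies, every TF and target name would be reported as missing, and relabelling or transposing the expression matrix so that genes are the column labels would be the thing to try before calling modules_from_adjacencies again.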

morganee261 commented 4 years ago

Hello,

Do you have a tutorial for the CLI? Can I start with the expression matrix filtered from the R package, or do I need to run the correlation first and then export it?

Thanks for your answer, Morgane

cflerin commented 4 years ago

From this tutorial (search for "cisTarget"), you can see the usage of the command line:

pyscenic ctx adj.tsv \
    {f_db_names} \
    --annotations_fname {f_motif_path} \
    --expression_mtx_fname {f_loom_path_scenic} \
    --output reg.csv \
    --num_workers 20

The correlation step is already built into this command, but you may need to write out your GRN output (adj.tsv) to a text file if you're running from Jupyter.
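As a rough sketch of writing the adjacencies out from a notebook, assuming they are held in a pandas DataFrame: the toy values below are made up, and the TF / target / importance column layout is an assumption based on the `adjacencies.TF` and `adjacencies.target` accesses in the traceback earlier in this thread.

```python
import pandas as pd

# Toy adjacencies table (hypothetical values); in a real session this
# would be the DataFrame produced by the GRN inference step.
adjacencies = pd.DataFrame(
    {
        "TF": ["SPI1", "CEBPB"],
        "target": ["RPS19", "SPI1"],
        "importance": [12.5, 8.1],
    }
)

# Write a tab-separated file without the row index, so the first
# column of adj.tsv is TF rather than a running number.
adjacencies.to_csv("adj.tsv", sep="\t", index=False)

# Read the file back to confirm that it round-trips.
check = pd.read_csv("adj.tsv", sep="\t")
print(check.columns.tolist())  # ['TF', 'target', 'importance']
```

The resulting adj.tsv is the file passed as the first argument to `pyscenic ctx` in the command above.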

morganee261 commented 4 years ago

Thanks for your help. I had to re-run the first part to get the adjacencies file in the correct form. I started the second part, to get the regulons, 2 days ago and it is still running. Do you have an idea of how long it might take? My matrix is about 40,000 cells and more than 22,000 genes. Morgane

cflerin commented 4 years ago

@morganee261, sorry for the delayed reply. It sounds like you fixed this particular problem, so I'll close this.