aertslab / SCENIC

SCENIC is an R package to infer Gene Regulatory Networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0

AssertionError: Signatures dataframe is empty! #189

Open JiaweiDai-create opened 3 years ago

JiaweiDai-create commented 3 years ago

Hi, thank you for developing SCENIC. I'm a new user of pySCENIC. When I tried to rerun the case study "GSE103322 - Head and Neck Squamous Cell Carcinoma (HNSC)", an unexpected problem occurred. Running `regulons = derive_regulons(df_motifs)` at STEP 4 raised an error:

>>> regulons = derive_regulons(df_motifs)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in derive_regulons
  File "/export/home/daijiawei/software/Anaconda/anaconda3/envs/pyscenic/lib/python3.7/site-packages/pyscenic/transform.py", line 311, in df2regulons
    assert not df.empty, 'Signatures dataframe is empty!'
AssertionError: Signatures dataframe is empty!

I then checked the function `derive_regulons()` and found that the problem occurs in the following code: `motifs = motifs[np.fromiter(map(compose(op.not_, contains('weight>50.0%')), df_motifs.Context), dtype=np.bool) & np.fromiter(map(contains(*db_names), df_motifs.Context), dtype=np.bool) & np.fromiter(map(contains('activating'), df_motifs.Context), dtype=np.bool)]` It returned:

>>> motifs
Empty DataFrame
Columns: [AUC, NES, MotifSimilarityQvalue, OrthologousIdentity, Annotation, Context, TargetGenes, RankAtMax]
Index: []

I'm confused because I copied the code from the tutorial exactly, and df_motifs itself looks normal:

>>> df_motifs
                                                            AUC  ...  RankAtMax
TF     MotifID                                                   ...           
ARNT   flyfactorsurvey__HLH106_SANGER_5_2_FBgn0015234  0.047117  ...        403
       cisbp__M5866                                    0.049741  ...        456
       flyfactorsurvey__tgo_cyc_SANGER_5_FBgn0015014   0.047026  ...       1209
       cisbp__M5290                                    0.047038  ...        335
       taipale_cyt_meth__BHLHE40_GTCACGTGAC_eDBD       0.050189  ...        635
...                                                         ...  ...        ...
ZNF506 transfac_pro__M06302                            0.065183  ...       1119
ZNF76  transfac_pro__M04732                            0.057430  ...       1160
       transfac_pro__M04682                            0.057910  ...        950
       dbcorrdb__SIX5__ENCSR000BJE_1__m2               0.054720  ...       2660
ZNF8   c2h2_zfs__M3875                                 0.048292  ...        833

[55418 rows x 8 columns]

Do you have any idea how I could fix this issue? I would be grateful for your help!

Besides, I also used the Docker image of SCENIC to analyse PBMC data and ended up with a file named auc_mtx.csv. What should I do next to plot interpretable figures such as a heatmap?

Best, Jiawei

noorisotoudeh commented 3 years ago

In case you haven't figured it out yet: edit the default `db_names` of `derive_regulons` (used in STEP 4: Cellular enrichment aka AUCell), changing it from

`def derive_regulons(motifs, db_names=('hg19-tss-centered-10kb-10species', 'hg19-500bp-upstream-10species', 'hg19-tss-centered-5kb-10species')):`

to

`def derive_regulons(motifs, db_names=('hg19-tss-centered-10kb-10species.mc9nr', 'hg19-500bp-upstream-10species.mc9nr', 'hg19-tss-centered-5kb-10species.mc9nr')):`

thanks,

Noori
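The fix above suggests that the filter matches database names exactly, so names missing the `.mc9nr` suffix match no rows and the filtered dataframe ends up empty. Here is a minimal sketch of that behaviour in plain Python. It assumes each `Context` value is a frozenset of strings; the sample contexts are made up for illustration, and `contains` is a reconstruction of the tutorial helper, not the tutorial's exact code.

```python
# Hypothetical Context values (assumption: one frozenset of strings per motif row).
contexts = [
    frozenset({"activating", "weight>75.0%", "hg19-tss-centered-10kb-10species.mc9nr"}),
    frozenset({"activating", "weight>50.0%", "hg19-500bp-upstream-10species.mc9nr"}),
    frozenset({"repressing", "weight>75.0%", "hg19-tss-centered-5kb-10species.mc9nr"}),
]


def contains(*elems):
    """Return a predicate that is true when a context holds any of elems.

    Membership in a frozenset is an exact string match, not a prefix match.
    """
    def predicate(context):
        return any(elem in context for elem in elems)
    return predicate


# Database names as written in the tutorial (no suffix).
old_names = (
    "hg19-tss-centered-10kb-10species",
    "hg19-500bp-upstream-10species",
    "hg19-tss-centered-5kb-10species",
)
# Database names with the suffix proposed in the fix.
new_names = tuple(name + ".mc9nr" for name in old_names)

old_hits = [contains(*old_names)(c) for c in contexts]  # no row matches
new_hits = [contains(*new_names)(c) for c in contexts]  # every row matches

# Diagnostic: list the database names that actually occur in the contexts,
# so db_names can be copied from the data instead of guessed.
found_db_names = sorted(
    {item for c in contexts for item in c if item.startswith("hg19-")}
)

print(old_hits, new_hits, found_db_names)
```

On real data, the same diagnostic could be run over `df_motifs.Context` to see which database names your ranking databases produced before setting `db_names`.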

JiaweiDai-create commented 3 years ago

Thank you, Noori. I had already spotted this problem, and the code now runs successfully.

Best, Jiawei