aertslab / pySCENIC

pySCENIC is a lightning-fast python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0
Scenic is not compatible with snakemake #523

Closed: wangjiawen2013 closed this issue 6 months ago

wangjiawen2013 commented 8 months ago

Hi,

Snakemake is also a workflow management system. Snakemake is highly popular, with >10 new citations per week. For an introduction, please visit https://snakemake.github.io/.

Though there are two Nextflow implementations available for SCENIC, I found that pySCENIC raises the error "NameError: name 'snakemake' is not defined" when run with Snakemake. Is it related to the numba.jit decorator? Could you provide a Snakemake implementation for pySCENIC?

ghuls commented 8 months ago

It is not related to nopython. If you want to run pyscenic via snakemake, call the command line version of pySCENIC from your snakemake file.
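
A minimal sketch of what calling the command-line version could look like, written as a Python helper that only builds the three usual pySCENIC commands (`grn`, `ctx`, `aucell`) as argument lists. The helper name `pyscenic_commands`, all file names, and the output layout are assumptions for illustration and are not taken from this thread. The sketch also assumes the expression matrix has already been exported to a format the CLI reads (for example loom), since the rule in question starts from an h5ad file. Each printed command could be placed in the `shell:` directive of a separate Snakemake rule.

```python
import shlex


def pyscenic_commands(expr, tfs, databases, annotations, outdir, workers=10):
    """Build the grn -> ctx -> aucell command lines (hypothetical helper)."""
    # Intermediate and final file names are placeholders chosen for this sketch.
    adjacencies = f"{outdir}/adjacencies.csv"
    regulons = f"{outdir}/regulons.csv"
    auc_mtx = f"{outdir}/auc_mtx.csv"
    workers_arg = ["--num_workers", str(workers)]
    return [
        # Step 1: infer TF-target adjacencies from the expression matrix
        ["pyscenic", "grn", expr, tfs, "-o", adjacencies, *workers_arg],
        # Step 2: prune modules with motif rankings to obtain regulons
        ["pyscenic", "ctx", adjacencies, *databases,
         "--annotations_fname", annotations,
         "--expression_mtx_fname", expr,
         "-o", regulons, *workers_arg],
        # Step 3: score regulon activity per cell
        ["pyscenic", "aucell", expr, regulons, "-o", auc_mtx, *workers_arg],
    ]


if __name__ == "__main__":
    # Print the commands only; nothing is executed here.
    for cmd in pyscenic_commands(
        "expr.loom", "tfs.txt", ["rankings.feather"], "motifs.tbl", "out"
    ):
        print(shlex.join(cmd))
```

Building the commands as lists keeps quoting explicit and lets a later rule pick up the output files for custom plotting.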

wangjiawen2013 commented 8 months ago

While Snakemake also allows you to directly write Python code inside a rule, it is usually reasonable to move such logic into separate scripts. For this purpose, Snakemake offers the script directive. (adapted from https://snakemake.readthedocs.io/en/stable/tutorial/basics.html)

I can run the Python version of pySCENIC inside a Snakemake rule successfully, but I cannot run it from a separate script. I didn't use the command-line version of pySCENIC because I want to output some customized plots generated by Python plotting code. In this case, running the Python version of pySCENIC inside a Snakemake rule is more convenient.

ghuls commented 8 months ago

Without the code of pyscenic_module.py it is hard to say anything. But you will probably be better off running pySCENIC in command-line mode, then reading the output files and doing your custom plotting in a later step.

wangjiawen2013 commented 8 months ago

Here is the rule:

rule pyscenic_module:
    input:
        adata = "scvi_integration/integration.h5ad"
    output:
        df_motifs = "pyscenic_module/df_motifs.csv",
        regulon_pkl = "pyscenic_module/integration_regulons.p",
        regulon_df = "pyscenic_module/integration_regulons.xlsx",
        auc_mtx = "pyscenic_module/integration_auc_mtx.csv",
        regulons_umap = "pyscenic_module/umap_integration_regulons.pdf",
        adata = "pyscenic_module/integration.h5ad",
        rss = "pyscenic_module/integration_rss/rss.csv",
        clustermap = "pyscenic_module/integration_clustermap.pdf",
        cluster_rows = "pyscenic_module/integration_clustermap_rows.csv",
        cluster_columns = "pyscenic_module/integration_clustermap_columns.csv"
    params:
        sc_plot = "pyscenic_module",
        rss_plot = "pyscenic_module/integration_rss/"
    threads: 10
    script:
        "scripts/pyscenic_module.py"

ghuls commented 8 months ago

I see the problem now. pySCENIC is using multithreading. For multithreading, your code needs to be importable (basically, inside a function). I am not sure whether the magic snakemake object can be passed along. Maybe it works if you pass snakemake as an argument to a function.
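
One way to read this suggestion is sketched below. It assumes the script can be reorganized so that nothing touches the `snakemake` object at import time: the work lives in functions, and the object is read only under an `if __name__ == "__main__":` guard and passed in as an argument. The names `run_analysis`, `main`, and `demo_snakemake` are hypothetical, and `run_analysis` is a placeholder that does not call pySCENIC. Whether this layout alone resolves the NameError in a given setup is not confirmed in this thread.

```python
from types import SimpleNamespace


def run_analysis(adata_path, auc_mtx_path, threads):
    """Hypothetical placeholder for the pySCENIC calls and plotting code."""
    # A real script would load `adata_path`, run pySCENIC with `threads`
    # workers, and write results to `auc_mtx_path`. This sketch only echoes
    # its arguments so that it runs without pySCENIC installed.
    return {"adata": adata_path, "auc_mtx": auc_mtx_path, "threads": threads}


def main(smk):
    # Pull plain values out of the Snakemake object once and pass them on,
    # so the worker code never refers to a global named `snakemake`.
    return run_analysis(smk.input.adata, smk.output.auc_mtx, smk.threads)


def demo_snakemake():
    """Stand-in object for trying the script outside Snakemake (hypothetical)."""
    return SimpleNamespace(
        input=SimpleNamespace(adata="scvi_integration/integration.h5ad"),
        output=SimpleNamespace(auc_mtx="pyscenic_module/integration_auc_mtx.csv"),
        threads=10,
    )


if __name__ == "__main__":
    # Under the script directive a global `snakemake` is expected to exist;
    # outside Snakemake, fall back to the stand-in above.
    smk = globals().get("snakemake", demo_snakemake())
    print(main(smk))
```

With this layout, a process that merely imports the module only sees function definitions, because the lookup of `snakemake` happens solely inside the guarded block.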

wangjiawen2013 commented 7 months ago

This issue has been solved; see https://github.com/snakemake/snakemake/issues/2678#issuecomment-1960903541