aertslab / pySCENIC

pySCENIC is a lightning-fast Python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering), which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0

[BUG] “pyscenic ctx” cannot load “.feather” correctly #420

Open JRZL123 opened 2 years ago

JRZL123 commented 2 years ago

Dear pySCENIC development team, I encountered a problem when using “pyscenic ctx”.

Describe the bug

Running “pyscenic ctx” fails with the error: "mm9-500bp-upstream-10species.mc9nr.genes_vs_motifs.rankings.feather" is not a cisTarget Feather database in Feather v1 or v2 format. The ".feather" files were downloaded via zsync_curl, and the checksums match.

Reproduce the behavior

  1. Command run when the error occurred:

    pyscenic ctx Astrocyte.tsv mm9-500bp-upstream-10species.mc9nr.genes_vs_motifs.rankings.feather mm9-tss-centered-10kb-10species.mc9nr.genes_vs_motifs.rankings.feather --annotations_fname motifs-v9-nr.mgi-m0.001-o0.0.tbl --expression_mtx_fname Astrocyte.loom --mode "dask_multiprocessing" --output Astrocyte_reg.csv --num_workers 4 --mask_dropouts
  2. Error encountered:

    
    2022-08-29 11:11:22,243 - pyscenic.cli.pyscenic - INFO - Creating modules.
    2022-08-29 11:11:22,613 - pyscenic.cli.pyscenic - INFO - Loading expression matrix.
    2022-08-29 11:11:22,867 - pyscenic.utils - INFO - Calculating Pearson correlations.
    2022-08-29 11:11:22,987 - pyscenic.utils - WARNING - Note on correlation calculation: the default behaviour for calculating the correlations has changed after pySCENIC verion 0.9.16. Previously, the default was to calculate the correlation between a TF and target gene using only cells with non-zero expression values (mask_dropouts=True). The current default is now to use all cells to match the behavior of the R verision of SCENIC. The original settings can be retained by setting 'rho_mask_dropouts=True' in the modules_from_adjacencies function, or '--mask_dropouts' from the CLI. Dropout masking is currently set to [True].
    2022-08-29 11:11:25,281 - pyscenic.utils - INFO - Creating modules.
    2022-08-29 11:11:50,503 - pyscenic.cli.pyscenic - INFO - Loading databases.
    Traceback (most recent call last):
      File "/home/lhz197104/miniconda3/bin/pyscenic", line 8, in <module>
        sys.exit(main())
      File "/home/lhz197104/miniconda3/lib/python3.8/site-packages/pyscenic/cli/pyscenic.py", line 677, in main
        args.func(args)
      File "/home/lhz197104/miniconda3/lib/python3.8/site-packages/pyscenic/cli/pyscenic.py", line 215, in prune_targets_command
        dbs = _load_dbs(args.database_fname)
      File "/home/lhz197104/miniconda3/lib/python3.8/site-packages/pyscenic/cli/pyscenic.py", line 176, in _load_dbs
        return [opendb(fname=fname.name, name=get_name(fname.name)) for fname in fnames]
      File "/home/lhz197104/miniconda3/lib/python3.8/site-packages/pyscenic/cli/pyscenic.py", line 176, in <listcomp>
        return [opendb(fname=fname.name, name=get_name(fname.name)) for fname in fnames]
      File "/home/lhz197104/miniconda3/lib/python3.8/site-packages/ctxcore/rnkdb.py", line 180, in opendb
        return FeatherRankingDatabase(fname, name=name)
      File "/home/lhz197104/miniconda3/lib/python3.8/site-packages/ctxcore/rnkdb.py", line 109, in __init__
        self.ct_db = CisTargetDatabase.init_ct_db(
      File "/home/lhz197104/miniconda3/lib/python3.8/site-packages/ctxcore/ctdb.py", line 170, in init_ct_db
        raise ValueError(
    ValueError: "mm9-500bp-upstream-10species.mc9nr.genes_vs_motifs.rankings.feather" is not a cisTarget Feather database in Feather v1 or v2 format.


Expected behavior

Did I do something wrong? How can I solve this problem?

Please complete the following information:

- pySCENIC version: [0.12.0]
- Installation method: [Conda]
- Run environment: [CLI]
- OS: [WSL2 Ubuntu 20.04.4]
- Package versions:

    aiohttp==3.8.1
    aiosignal==1.2.0
    arboreto==0.1.6
    async-timeout==4.0.2
    attrs==22.1.0
    bokeh==2.4.3
    boltons==21.0.0
    Bottleneck @ file:///tmp/build/80754af9/bottleneck_1648028895253/work
    brotlipy==0.7.0
    certifi @ file:///opt/conda/conda-bld/certifi_1655968806487/work/certifi
    cffi @ file:///opt/conda/conda-bld/cffi_1642701102775/work
    charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
    click @ file:///tmp/build/80754af9/click_1646038465422/work
    cloudpickle==2.1.0
    colorama @ file:///tmp/build/80754af9/colorama_1607707115595/work
    conda==4.14.0
    conda-content-trust @ file:///tmp/build/80754af9/conda-content-trust_1617045594566/work
    conda-package-handling @ file:///tmp/build/80754af9/conda-package-handling_1649087926789/work
    cryptography @ file:///tmp/build/80754af9/cryptography_1639400846433/work
    ctxcore==0.2.0
    cycler @ file:///tmp/build/80754af9/cycler_1637851556182/work
    Cython @ file:///tmp/build/80754af9/cython_1647832478439/work
    cytoolz==0.11.0
    dask==2022.8.1
    dill==0.3.5.1
    distributed==2022.8.1
    fonttools==4.25.0
    frozendict==2.3.4
    frozenlist==1.3.1
    fsspec==2022.7.1
    h5py==2.10.0
    HeapDict==1.0.1
    idna @ file:///tmp/build/80754af9/idna_1637925883363/work
    interlap==0.2.7
    Jinja2==3.1.2
    joblib @ file:///tmp/build/80754af9/joblib_1635411271373/work
    kiwisolver @ file:///opt/conda/conda-bld/kiwisolver_1653292039266/work
    llvmlite==0.38.0
    locket==1.0.0
    loompy==3.0.7
    MarkupSafe==2.1.1
    matplotlib @ file:///tmp/build/80754af9/matplotlib-suite_1647441664166/work
    mkl-fft==1.3.1
    mkl-random @ file:///tmp/build/80754af9/mkl_random_1626186064646/work
    mkl-service==2.4.0
    mock @ file:///tmp/build/80754af9/mock_1607622725907/work
    msgpack==1.0.4
    multidict==6.0.2
    multiprocessing-on-dill==3.5.0a4
    networkx==2.8.6
    numba @ file:///opt/conda/conda-bld/numba_1648040517072/work
    numexpr @ file:///tmp/build/80754af9/numexpr_1640704208950/work
    numpy @ file:///opt/conda/conda-bld/numpy_and_numpy_base_1651563629415/work
    numpy-groupies==0.9.19
    packaging @ file:///tmp/build/80754af9/packaging_1637314298585/work
    pandas==1.4.3
    partd==1.3.0
    patsy==0.5.2
    Pillow==9.0.1
    psutil==5.9.1
    pyarrow==9.0.0
    pycosat==0.6.3
    pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
    pynndescent==0.5.7
    pyOpenSSL @ file:///opt/conda/conda-bld/pyopenssl_1643788558760/work
    pyparsing @ file:///opt/conda/conda-bld/pyparsing_1661452539315/work
    pysam==0.19.1
    pyscenic==0.12.0
    PySocks @ file:///tmp/build/80754af9/pysocks_1605305779399/work
    python-dateutil @ file:///tmp/build/80754af9/python-dateutil_1626374649649/work
    pytz==2022.2.1
    PyYAML==6.0
    requests @ file:///opt/conda/conda-bld/requests_1641824580448/work
    ruamel-yaml-conda @ file:///tmp/build/80754af9/ruamel_yaml_1616016699510/work
    scikit-learn @ file:///tmp/build/80754af9/scikit-learn_1642617107864/work
    scipy @ file:///tmp/build/80754af9/scipy_1641555001653/work
    seaborn @ file:///tmp/build/80754af9/seaborn_1629307859561/work
    sip==4.19.13
    six @ file:///tmp/build/80754af9/six_1644875935023/work
    sortedcontainers==2.4.0
    statsmodels @ file:///tmp/build/80754af9/statsmodels_1648033297787/work
    tables==3.6.1
    tblib==1.7.0
    threadpoolctl @ file:///Users/ktietz/demo/mc3/conda-bld/threadpoolctl_1629802263681/work
    toolz @ file:///tmp/build/80754af9/toolz_1636545406491/work
    tornado @ file:///tmp/build/80754af9/tornado_1606942300299/work
    tqdm @ file:///opt/conda/conda-bld/tqdm_1647339053476/work
    typing_extensions==4.3.0
    umap-learn==0.5.3
    urllib3 @ file:///opt/conda/conda-bld/urllib3_1643638302206/work
    velocyto==0.17.17
    yarl==1.8.1
    zict==2.2.0
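
Editor's note: the ValueError reported above says the file is in neither Feather v1 nor Feather v2 format. One way to check this independently of pySCENIC is to look at the first bytes of the file: Feather v1 files start with the magic bytes `FEA1`, and Feather v2 files (the Arrow IPC file format) start with `ARROW1`. The sketch below is only a diagnostic aid under that assumption; the helper names `sniff_feather_version` and `describe` are hypothetical and are not part of pySCENIC or ctxcore.

```python
from pathlib import Path

# Magic bytes found at the start of each on-disk format.
FEATHER_V1_MAGIC = b"FEA1"    # original Feather format
ARROW_IPC_MAGIC = b"ARROW1"   # Arrow IPC file format, i.e. Feather v2


def sniff_feather_version(path):
    """Return 1 or 2 for Feather v1/v2, or None if neither magic matches."""
    with open(path, "rb") as fh:
        head = fh.read(len(ARROW_IPC_MAGIC))
    if head.startswith(ARROW_IPC_MAGIC):
        return 2
    if head.startswith(FEATHER_V1_MAGIC):
        return 1
    return None


def describe(path):
    """Summarise file size and detected format for one database file."""
    p = Path(path)
    version = sniff_feather_version(p)
    label = f"Feather v{version}" if version else "not Feather (truncated or wrong file?)"
    return f"{p.name}: {p.stat().st_size} bytes, {label}"
```

Calling `describe()` on each database path passed to `pyscenic ctx` shows whether the file on disk looks like a Feather file at all. A result of `None` would point at the download (for example an incomplete file or a saved error page) rather than at pySCENIC itself.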

ghuls commented 2 years ago

Install pySCENIC 0.12.0: https://pypi.org/project/pyscenic/

JRZL123 commented 2 years ago

Same problem, nothing changed (ㄒoㄒ)

JRZL123 commented 2 years ago

I'm pretty sure the "mm9.feather" files are the problem. With "mm10_10kbp_up_10kbp_down_full_tx_clustered.genes_vs_motifs.rankings.feather" and "mm10_500bp_up_100bp_down_full_tx_clustered.genes_vs_motifs.rankings.feather" it runs perfectly fine! Now the only question is: if I use the "mm10.feather" databases to analyze a count matrix obtained with "mm9", will there be serious consequences?

ghuls commented 2 years ago

Normally not a lot. It might even be better as the mm9 gene annotation used in the old database was quite old (so you might recover more genes from your count matrix).
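
Editor's note: the point about recovering more genes can be checked directly by comparing the gene names in the count matrix with the gene names in the database. The sketch below assumes both lists have already been extracted (how they are read from the loom file and the rankings database is left out), and the gene symbols shown are made-up examples, not data from this issue. `gene_overlap` is a hypothetical helper name.

```python
def gene_overlap(matrix_genes, database_genes):
    """Return (fraction of matrix genes present in the database, missing genes)."""
    matrix = set(matrix_genes)
    shared = matrix & set(database_genes)
    missing = sorted(matrix - shared)
    fraction = len(shared) / len(matrix) if matrix else 0.0
    return fraction, missing


# Illustrative inputs only: replace with the real gene lists.
matrix_genes = ["Gfap", "Aqp4", "Sox9", "2610017I09Rik"]
mm10_db_genes = ["Gfap", "Aqp4", "Sox9", "Olig2"]

fraction, missing = gene_overlap(matrix_genes, mm10_db_genes)
print(f"{fraction:.0%} of matrix genes found; missing: {missing}")
```

Running the same comparison against the mm9 and mm10 gene lists would show which database covers more of the matrix.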

AdiRavid commented 2 years ago

Hi, I'm having the same issue with the mm10 files, running pySCENIC 0.12.0.

ValueError: "mm10_10kbp_up_10kbp_down_full_tx_v10_clust.genes_vs_motifs.rankings.feather" is not a cisTarget Feather database in Feather v1 or v2 format.

Any more ideas?

ghuls commented 1 year ago

Can you redownload the file? The file didn't exist before (different name).

AdiRavid commented 1 year ago

Trying again, but the checksum file doesn't exist. sha256sum.txt
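
Editor's note: once a published SHA-256 value for the database is available, the downloaded file can be checked locally with Python's standard `hashlib` module. This is a minimal sketch; the function names are hypothetical, and the expected checksum has to come from the database provider, since none is given in this thread.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Reading in chunks keeps memory use low, which matters because the rankings databases are large files.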