aertslab / scenicplus

SCENIC+ is a Python package to build gene regulatory networks (GRNs) using combined or separate single-cell gene expression (scRNA-seq) and single-cell chromatin accessibility (scATAC-seq) data.

Problem with TSS #474

Open ReviewBrandLab2024 opened 1 month ago

ReviewBrandLab2024 commented 1 month ago

Hello,

Thank you for all your help. We had not run into any problems since last time, until now, with the following command:

!mkdir -p /staging/tur/Erythroid_Final/outs/qc
!pycistopic tss get_tss \
    --output /staging/tur/Erythroid_Final/outs/qc/tss.bed \
    --name "hsapiens_gene_ensembl" \
    --to-chrom-source ucsc \
    --ucsc hg38

Here is the error:

(scenicplus) [tur@ap2001 qc]$ pycistopic tss gene_annotation_list | grep Human
Traceback (most recent call last):
  File "/home/tur/miniconda3/envs/scenicplus/bin/pycistopic", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cli/pycistopic.py", line 26, in main
    args.func(args)
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cli/subcommand/tss.py", line 479, in run_tss_gene_annotation_list
    get_species_gene_annotation_ensembl_biomart_dataset_names(
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cli/subcommand/tss.py", line 299, in get_species_gene_annotation_ensembl_biomart_dataset_names
    biomart_datasets = ga.get_all_gene_annotation_ensembl_biomart_dataset_names(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/gene_annotation.py", line 56, in get_all_gene_annotation_ensembl_biomart_dataset_names
    import pybiomart as pbm
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pybiomart/__init__.py", line 3, in <module>
    from .server import Server
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pybiomart/server.py", line 12, in <module>
    from .base import ServerBase
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pybiomart/base.py", line 15, in <module>
    requests_cache.install_cache('.pybiomart')
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/requests_cache/patcher.py", line 41, in install_cache
    backend = init_backend(cache_name, backend, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/requests_cache/backends/__init__.py", line 91, in init_backend
    return BACKEND_CLASSES[backend](cache_name, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/requests_cache/backends/sqlite.py", line 61, in __init__
    self.responses: SQLiteDict = SQLiteDict(db_path, table_name='responses', **skwargs)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/requests_cache/backends/sqlite.py", line 225, in __init__
    self.init_db()
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/requests_cache/backends/sqlite.py", line 230, in init_db
    with self.connection(commit=True) as con:
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/requests_cache/backends/sqlite.py", line 252, in connection
    self._connection = sqlite3.connect(self.db_path, **self.connection_kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.OperationalError: unable to open database file
(scenicplus) [tur@ap2001 qc]$ !mkdir -p outs/qc
!pycistopic tss get_tss \
    --output outs/qc/tss.bed \
    --name "hsapiens_gene_ensembl" \
    --to-chrom-source ucsc \
    --ucsc hg38
mkdir scRNA -p outs/qc
mkdir: cannot create directory ‘scRNA’: Disk quota exceeded
mkdir: cannot create directory ‘outs’: Disk quota exceeded
pycistopic tss gene_annotation_list | grep Human tss get_tss \
grep: unrecognized option '--output'
Usage: grep [OPTION]... PATTERNS [FILE]...
Try 'grep --help' for more information.
Traceback (most recent call last):
  File "/home/tur/miniconda3/envs/scenicplus/bin/pycistopic", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cli/pycistopic.py", line 26, in main
    args.func(args)
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cli/subcommand/tss.py", line 479, in run_tss_gene_annotation_list
    get_species_gene_annotation_ensembl_biomart_dataset_names(
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/cli/subcommand/tss.py", line 299, in get_species_gene_annotation_ensembl_biomart_dataset_names
    biomart_datasets = ga.get_all_gene_annotation_ensembl_biomart_dataset_names(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pycisTopic/gene_annotation.py", line 56, in get_all_gene_annotation_ensembl_biomart_dataset_names
    import pybiomart as pbm
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pybiomart/__init__.py", line 3, in <module>
    from .server import Server
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pybiomart/server.py", line 12, in <module>
    from .base import ServerBase
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pybiomart/base.py", line 15, in <module>
    requests_cache.install_cache('.pybiomart')
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/requests_cache/patcher.py", line 41, in install_cache
    backend = init_backend(cache_name, backend, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/requests_cache/backends/__init__.py", line 91, in init_backend
    return BACKEND_CLASSES[backend](cache_name, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/requests_cache/backends/sqlite.py", line 61, in __init__
    self.responses: SQLiteDict = SQLiteDict(db_path, table_name='responses', **skwargs)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/requests_cache/backends/sqlite.py", line 225, in __init__
    self.init_db()
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/requests_cache/backends/sqlite.py", line 230, in init_db
    with self.connection(commit=True) as con:
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/home/tur/miniconda3/envs/scenicplus/lib/python3.11/site-packages/requests_cache/backends/sqlite.py", line 252, in connection
    self._connection = sqlite3.connect(self.db_path, **self.connection_kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.OperationalError: unable to open database file

Do we need to update the command or something else?

Best,

Steven

SeppeDeWinter commented 1 month ago

Hi @ReviewBrandLab2024

This looks like a lot of errors. Can you run

pycistopic tss gene_annotation_list

in a bash (or zsh, etc.; not Python) kernel and show the output, please?

All the best,

Seppe

ghuls commented 5 days ago

@ReviewBrandLab2024 It is likely that BioMart had some problems (sometimes it seems to be overloaded) when you ran your original command. The request caching library used by pybiomart (requests_cache, as seen in your traceback) caches requests by default, and it probably cached an incomplete or failed response. You can delete the hidden cache file .pybiomart.sqlite in the directories where you ran the command to remove the cached response.

Or you can run with the --no-cache option.
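
For anyone who prefers to do the cleanup from Python rather than by hand, here is a minimal sketch of the first suggestion. Only the file name .pybiomart.sqlite comes from this thread; the helper name `remove_pybiomart_cache` is hypothetical and is not part of pycisTopic or pybiomart.

```python
from pathlib import Path

# File name taken from the comment above; the helper itself is illustrative only.
CACHE_NAME = ".pybiomart.sqlite"


def remove_pybiomart_cache(directory="."):
    """Delete the pybiomart cache file in `directory`.

    Returns True if a cache file was found and removed, False otherwise.
    """
    cache_file = Path(directory) / CACHE_NAME
    if cache_file.is_file():
        cache_file.unlink()
        return True
    return False
```

Calling `remove_pybiomart_cache("/path/where/the/command/was/run")` once per affected directory, and then rerunning the pycistopic command, should have the same effect as deleting the file manually.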

ghuls commented 5 days ago

The last 2 commits in the polars_1xx branch of pycisTopic will give more info in the future, when it detects problems likely related to failed cached BioMart requests: https://github.com/aertslab/pycisTopic/pull/149

$ pycistopic tss gene_annotation_list
Error: Could not get gene annotation Ensembl BioMart dataset names. Likely this is caused by and invalid/incomplete cached request from BioMart.

Use "--no-cache" or remove ".pybiomart.sqlite" in the current working directory and try again.

Traceback (most recent call last):
  File "/software/anaconda3/envs/pycistopic_3.11/bin/pycistopic", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/PycharmProjects/pycisTopic/src/pycisTopic/cli/pycistopic.py", line 26, in main
    args.func(args)
  File "/PycharmProjects/pycisTopic/src/pycisTopic/cli/subcommand/tss.py", line 502, in run_tss_gene_annotation_list
    get_species_gene_annotation_ensembl_biomart_dataset_names(
  File "/PycharmProjects/pycisTopic/src/pycisTopic/cli/subcommand/tss.py", line 325, in get_species_gene_annotation_ensembl_biomart_dataset_names
    raise e
  File "/PycharmProjects/pycisTopic/src/pycisTopic/cli/subcommand/tss.py", line 312, in get_species_gene_annotation_ensembl_biomart_dataset_names
    biomart_datasets = ga.get_all_gene_annotation_ensembl_biomart_dataset_names(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/PycharmProjects/pycisTopic/src/pycisTopic/gene_annotation.py", line 59, in get_all_gene_annotation_ensembl_biomart_dataset_names
    biomart = biomart_server["ENSEMBL_MART_ENSEMBL"]
              ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "/software/anaconda3/envs/pycistopic_3.11/lib/python3.11/site-packages/pybiomart/server.py", line 55, in __getitem__
    return self.marts[name]
           ^^^^^^^^^^
  File "/software/anaconda3/envs/pycistopic_3.11/lib/python3.11/site-packages/pybiomart/server.py", line 61, in marts
    self._marts = self._fetch_marts()
                  ^^^^^^^^^^^^^^^^^^^
  File "/software/anaconda3/envs/pycistopic_3.11/lib/python3.11/site-packages/pybiomart/server.py", line 79, in _fetch_marts
    response = self.get(type='registry')
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/software/anaconda3/envs/pycistopic_3.11/lib/python3.11/site-packages/pybiomart/base.py", line 107, in get
    r = requests.get(self.url, params=params)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/software/anaconda3/envs/pycistopic_3.11/lib/python3.11/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/software/anaconda3/envs/pycistopic_3.11/lib/python3.11/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/software/anaconda3/envs/pycistopic_3.11/lib/python3.11/site-packages/requests_cache/session.py", line 183, in request
    return super().request(method, url, *args, headers=headers, **kwargs)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/software/anaconda3/envs/pycistopic_3.11/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/software/anaconda3/envs/pycistopic_3.11/lib/python3.11/site-packages/requests_cache/session.py", line 218, in send
    cached_response = self.cache.get_response(actions.cache_key)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/software/anaconda3/envs/pycistopic_3.11/lib/python3.11/site-packages/requests_cache/backends/base.py", line 77, in get_response
    response = self.responses.get(key)
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen _collections_abc>", line 774, in get
  File "/software/anaconda3/envs/pycistopic_3.11/lib/python3.11/site-packages/requests_cache/backends/sqlite.py", line 312, in __getitem__
    cur = con.execute(f'SELECT value FROM {self.table_name} WHERE key=?', (key,))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.DatabaseError: database disk image is malformed
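
The final `sqlite3.DatabaseError` above is raised when the cache file itself is damaged. As a rough sketch, assuming the cache is an ordinary SQLite file, something like the following could be used to check a cache file before deciding whether to delete it. The function name `sqlite_cache_is_readable` is hypothetical; it only uses the standard-library `sqlite3` module and SQLite's `PRAGMA integrity_check`.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def sqlite_cache_is_readable(path):
    """Return True if `path` is an SQLite file that passes an integrity check."""
    path = Path(path)
    if not path.is_file():
        # Avoid sqlite3.connect() silently creating a new, empty database.
        return False
    try:
        with closing(sqlite3.connect(path)) as con:
            rows = con.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError:
        # Covers errors such as "database disk image is malformed"
        # and "file is not a database".
        return False
    # A healthy database reports a single row containing "ok".
    return rows == [("ok",)]
```

If `sqlite_cache_is_readable(".pybiomart.sqlite")` returns False for an existing file, removing that file (or using --no-cache) as suggested above is the likely fix.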