saezlab / decoupler-py

Python package to perform enrichment analysis from omics data.
https://decoupler-py.readthedocs.io/
GNU General Public License v3.0

dc.get_collectri and dc.get_dorothea fail with errors in Google Colab #71

Closed hengkp closed 1 year ago

hengkp commented 1 year ago

**Describe the bug**
I used Google Colab to run the decoupler-py pipeline. Last week I could successfully retrieve the data frames of both GRN networks (CollecTRI and DoRothEA). When I ran it again, I got the error messages shown in steps 3 and 4 below.

  1. Installing libraries
    # Install necessary libraries
    %pip install scanpy
    %pip install decoupler
    %pip install pydeseq2
    %pip install adjustText
    %pip install omnipath
  2. Importing libraries

    # Import necessary libraries
    import scanpy as sc
    import decoupler as dc
    import omnipath as op

    # Data Manipulation
    import numpy as np
    import pandas as pd
    from anndata import AnnData

    # Data Visualization
    import seaborn as sns
    import matplotlib.pyplot as plt

    # Differential Expression Analysis (DESeq2)
    from pydeseq2.dds import DeseqDataSet
    from pydeseq2.ds import DeseqStats

    import warnings
    warnings.filterwarnings("ignore")

  3. Get CollecTRI network

    # Retrieve CollecTRI gene regulatory network
    collectri = dc.get_collectri(organism='human', split_complexes=False)
    collectri

> ---------------------------------------------------------------------------
> ValueError                                Traceback (most recent call last)
> [<ipython-input-8-5ac8077dc22a>](https://localhost:8080/#) in <cell line: 2>()
>       1 # Retrieve CollecTRI gene regulatory network
> ----> 2 collectri = dc.get_collectri(organism='human', split_complexes=False, ssl=False)
>       3 collectri
> 
> 10 frames
> [/usr/local/lib/python3.10/dist-packages/decoupler/omnip.py](https://localhost:8080/#) in get_collectri(organism, split_complexes, **kwargs)
>     423 
>     424     # Load collectri
> --> 425     ct = op.interactions.CollecTRI.get(genesymbols=True, organism=_organism, loops=True, **kwargs)
>     426     if _organism == 'human':
>     427         mirna = op.interactions.TFmiRNA.get(genesymbols=True, databases=['CollecTRI'], strict_evidences=True)
> 
> [/usr/local/lib/python3.10/dist-packages/omnipath/_core/requests/_utils.py](https://localhost:8080/#) in wrapper(wrapped, _instance, args, kwargs)
>     112     @wrapt.decorator(adapter=wrapt.adapter_factory(argspec_factory))
>     113     def wrapper(wrapped, _instance, args, kwargs):
> --> 114         return wrapped(*args, **kwargs)
>     115 
>     116     from_class = hasattr(clazz, "get") and not hasattr(clazz.get, "__wrapped__")
> 
> [/usr/local/lib/python3.10/dist-packages/omnipath/_core/requests/_utils.py](https://localhost:8080/#) in _get_helper(cls, **kwargs)
>      29         The result which depends the type of the request and the supplied parameters.
>      30     """
> ---> 31     return cls()._get(**kwargs)
>      32 
>      33 
> 
> [/usr/local/lib/python3.10/dist-packages/omnipath/_core/requests/_request.py](https://localhost:8080/#) in _get(self, **kwargs)
>     106         kwargs = self._inject_fields(kwargs)
>     107         kwargs, callback = self._convert_params(kwargs)
> --> 108         kwargs = self._validate_params(kwargs)
>     109         kwargs = self._finalize_params(kwargs)
>     110         self._last_param["final"] = kwargs.copy()
> 
> [/usr/local/lib/python3.10/dist-packages/omnipath/_core/requests/_request.py](https://localhost:8080/#) in _validate_params(self, params)
>     193         for k, v in params.items():
>     194             # first get the validator for the parameter, then validate
> --> 195             res[self._query_type(k).param] = self._query_type(k)(v)
>     196         return res
>     197 
> 
> [/usr/local/lib/python3.10/dist-packages/omnipath/_core/query/_query.py](https://localhost:8080/#) in __call__(self, value)
>     134     ) -> Optional[Set[str]]:
>     135         """%(validate)s"""  # noqa: D401
> --> 136         return self.value(value)
>     137 
>     138     @property
> 
> [/usr/local/lib/python3.10/dist-packages/omnipath/constants/_constants.py](https://localhost:8080/#) in __call__(cls, *args, **kw)
>      49                 f"without `__error_format__` class attribute."
>      50             )
> ---> 51         return super().__call__(*args, **kw)
>      52 
>      53     def __new__(cls, clsname, superclasses, attributedict):  # noqa: D102
> 
> [/usr/lib/python3.10/enum.py](https://localhost:8080/#) in __call__(cls, value, names, module, qualname, type, start)
>     383         """
>     384         if names is None:  # simple value lookup
> --> 385             return cls.__new__(cls, value)
>     386         # otherwise, functional API: we're creating a new Enum type
>     387         return cls._create_(
> 
> [/usr/local/lib/python3.10/dist-packages/omnipath/constants/_constants.py](https://localhost:8080/#) in wrapper(*args, **kwargs)
>      13             _cls, value, *_ = args
>      14             e.args = (cls._format(value),)
> ---> 15             raise e
>      16 
>      17     if not issubclass(cls, ErrorFormatter):
> 
> [/usr/local/lib/python3.10/dist-packages/omnipath/constants/_constants.py](https://localhost:8080/#) in wrapper(*args, **kwargs)
>       9     def wrapper(*args, **kwargs) -> Enum:
>      10         try:
> ---> 11             return fun(*args, **kwargs)
>      12         except ValueError as e:
>      13             _cls, value, *_ = args
> 
> [/usr/lib/python3.10/enum.py](https://localhost:8080/#) in __new__(cls, value)
>     708                 ve_exc = ValueError("%r is not a valid %s" % (value, cls.__qualname__))
>     709                 if result is None and exc is None:
> --> 710                     raise ve_exc
>     711                 elif exc is None:
>     712                     exc = TypeError(
> 
> ValueError: Invalid value `loops` for `InteractionsQuery`. Valid options are: `['database', 'databases', 'dataset', 'datasets', 'directed', 'directeds', 'dorothea_level', 'dorothea_levels', 'dorothea_method', 'dorothea_methods', 'entity_type', 'entity_types', 'field', 'fields', 'format', 'formats', 'genesymbol', 'genesymbols', 'header', 'headers', 'license', 'licenses', 'limit', 'limits', 'organism', 'organisms', 'partner', 'partners', 'password', 'passwords', 'resource', 'resources', 'signed', 'signeds', 'source_target', 'source_targets', 'source', 'sources', 'target', 'targets', 'tfregulons_level', 'tfregulons_levels', 'tfregulons_method', 'tfregulons_methods', 'type', 'types']`.

  4. Get DoRothEA network

    # Retrieve DoRothEA gene regulatory network
    dorothea = dc.get_dorothea(organism='human')
    dorothea



> WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=5, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)'))': /interactions?datasets=dorothea&dorothea_levels=A%2CB%2CC%2CD&fields=curation_effort%2Cdorothea_level%2Cevidences%2Cextra_attrs%2Creferences%2Csources&format=tsv&genesymbols=1&organisms=9606
> WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=5, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)'))': /interactions?datasets=dorothea&dorothea_levels=A%2CB%2CC%2CD&fields=curation_effort%2Cdorothea_level%2Cevidences%2Cextra_attrs%2Creferences%2Csources&format=tsv&genesymbols=1&organisms=9606
> WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=5, status=None)) after connection broken by 'SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)'))': /interactions?datasets=dorothea&dorothea_levels=A%2CB%2CC%2CD&fields=curation_effort%2Cdorothea_level%2Cevidences%2Cextra_attrs%2Creferences%2Csources&format=tsv&genesymbols=1&organisms=9606
> ---------------------------------------------------------------------------
> SSLCertVerificationError                  Traceback (most recent call last)
> [/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py](https://localhost:8080/#) in _make_request(self, conn, method, url, body, headers, retries, timeout, chunked, response_conn, preload_content, decode_content, enforce_content_length)
>     466             try:
> --> 467                 self._validate_conn(conn)
>     468             except (SocketTimeout, BaseSSLError) as e:
> 
> 24 frames
> [/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py](https://localhost:8080/#) in _validate_conn(self, conn)
>    1091         if conn.is_closed:
> -> 1092             conn.connect()
>    1093 
> 
> [/usr/local/lib/python3.10/dist-packages/urllib3/connection.py](https://localhost:8080/#) in connect(self)
>     641 
> --> 642         sock_and_verified = _ssl_wrap_socket_and_match_hostname(
>     643             sock=sock,
> 
> [/usr/local/lib/python3.10/dist-packages/urllib3/connection.py](https://localhost:8080/#) in _ssl_wrap_socket_and_match_hostname(sock, cert_reqs, ssl_version, ssl_minimum_version, ssl_maximum_version, cert_file, key_file, key_password, ca_certs, ca_cert_dir, ca_cert_data, assert_hostname, assert_fingerprint, server_hostname, ssl_context, tls_in_tls)
>     782 
> --> 783     ssl_sock = ssl_wrap_socket(
>     784         sock=sock,
> 
> [/usr/local/lib/python3.10/dist-packages/urllib3/util/ssl_.py](https://localhost:8080/#) in ssl_wrap_socket(sock, keyfile, certfile, cert_reqs, ca_certs, server_hostname, ssl_version, ciphers, ssl_context, ca_cert_dir, key_password, ca_cert_data, tls_in_tls)
>     468 
> --> 469     ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
>     470     return ssl_sock
> 
> [/usr/local/lib/python3.10/dist-packages/urllib3/util/ssl_.py](https://localhost:8080/#) in _ssl_wrap_socket_impl(sock, ssl_context, tls_in_tls, server_hostname)
>     512 
> --> 513     return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
> 
> [/usr/lib/python3.10/ssl.py](https://localhost:8080/#) in wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, session)
>     512         # ctx._wrap_socket()
> --> 513         return self.sslsocket_class._create(
>     514             sock=sock,
> 
> [/usr/lib/python3.10/ssl.py](https://localhost:8080/#) in _create(cls, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, context, session)
>    1070                         raise ValueError("do_handshake_on_connect should not be specified for non-blocking sockets")
> -> 1071                     self.do_handshake()
>    1072             except (OSError, ValueError):
> 
> [/usr/lib/python3.10/ssl.py](https://localhost:8080/#) in do_handshake(self, block)
>    1341                 self.settimeout(None)
> -> 1342             self._sslobj.do_handshake()
>    1343         finally:
> 
> SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)
> 
> During handling of the above exception, another exception occurred:
> 
> SSLError                                  Traceback (most recent call last)
> [/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py](https://localhost:8080/#) in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
>     789             # Make the request on the HTTPConnection object
> --> 790             response = self._make_request(
>     791                 conn,
> 
> [/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py](https://localhost:8080/#) in _make_request(self, conn, method, url, body, headers, retries, timeout, chunked, response_conn, preload_content, decode_content, enforce_content_length)
>     490                 new_e = _wrap_proxy_error(new_e, conn.proxy.scheme)
> --> 491             raise new_e
>     492 
> 
> SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)
> 
> The above exception was the direct cause of the following exception:
> 
> MaxRetryError                             Traceback (most recent call last)
> [/usr/local/lib/python3.10/dist-packages/requests/adapters.py](https://localhost:8080/#) in send(self, request, stream, timeout, verify, cert, proxies)
>     485         try:
> --> 486             resp = conn.urlopen(
>     487                 method=request.method,
> 
> [/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py](https://localhost:8080/#) in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
>     873             )
> --> 874             return self.urlopen(
>     875                 method,
> 
> [/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py](https://localhost:8080/#) in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
>     873             )
> --> 874             return self.urlopen(
>     875                 method,
> 
> [/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py](https://localhost:8080/#) in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
>     873             )
> --> 874             return self.urlopen(
>     875                 method,
> 
> [/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py](https://localhost:8080/#) in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
>     843 
> --> 844             retries = retries.increment(
>     845                 method, url, error=new_e, _pool=self, _stacktrace=sys.exc_info()[2]
> 
> [/usr/local/lib/python3.10/dist-packages/urllib3/util/retry.py](https://localhost:8080/#) in increment(self, method, url, response, error, _pool, _stacktrace)
>     514             reason = error or ResponseError(cause)
> --> 515             raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
>     516 
> 
> MaxRetryError: HTTPSConnectionPool(host='omnipathdb.org', port=443): Max retries exceeded with url: /interactions?datasets=dorothea&dorothea_levels=A%2CB%2CC%2CD&fields=curation_effort%2Cdorothea_level%2Cevidences%2Cextra_attrs%2Creferences%2Csources&format=tsv&genesymbols=1&organisms=9606 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)')))
> 
> During handling of the above exception, another exception occurred:
> 
> SSLError                                  Traceback (most recent call last)
> [<ipython-input-9-e85bf64b66b8>](https://localhost:8080/#) in <cell line: 2>()
>       1 # Retrieve DoRothEA gene regulatory network
> ----> 2 dorothea = dc.get_dorothea(organism='human', ssl=False)
>       3 dorothea
> 
> [/usr/local/lib/python3.10/dist-packages/decoupler/omnip.py](https://localhost:8080/#) in get_dorothea(organism, levels, weight_dict, **kwargs)
>     305 
>     306     # Load Dorothea
> --> 307     do = op.interactions.Dorothea.get(
>     308         fields=['dorothea_level', 'extra_attrs'],
>     309         dorothea_levels=['A', 'B', 'C', 'D'],
> 
> [/usr/local/lib/python3.10/dist-packages/omnipath/_core/requests/_utils.py](https://localhost:8080/#) in wrapper(wrapped, _instance, args, kwargs)
>     112     @wrapt.decorator(adapter=wrapt.adapter_factory(argspec_factory))
>     113     def wrapper(wrapped, _instance, args, kwargs):
> --> 114         return wrapped(*args, **kwargs)
>     115 
>     116     from_class = hasattr(clazz, "get") and not hasattr(clazz.get, "__wrapped__")
> 
> [/usr/local/lib/python3.10/dist-packages/omnipath/_core/requests/_utils.py](https://localhost:8080/#) in _get_helper(cls, **kwargs)
>      29         The result which depends the type of the request and the supplied parameters.
>      30     """
> ---> 31     return cls()._get(**kwargs)
>      32 
>      33 
> 
> [/usr/local/lib/python3.10/dist-packages/omnipath/_core/requests/_request.py](https://localhost:8080/#) in _get(self, **kwargs)
>     110         self._last_param["final"] = kwargs.copy()
>     111 
> --> 112         res = self._downloader.maybe_download(
>     113             self._query_type.endpoint, params=kwargs, callback=callback, is_final=False
>     114         )
> 
> [/usr/local/lib/python3.10/dist-packages/omnipath/_core/downloader/_downloader.py](https://localhost:8080/#) in maybe_download(self, url, callback, params, cache, is_final, **_)
>     125             res = self._options.cache[key]
>     126         else:
> --> 127             res = callback(self._download(req))
>     128             if cache:
>     129                 logging.debug(f"Caching result to `{self._options.cache}[{key!r}]`")
> 
> [/usr/local/lib/python3.10/dist-packages/omnipath/_core/downloader/_downloader.py](https://localhost:8080/#) in _download(self, req)
>     151 
>     152         handle = BytesIO()
> --> 153         with self._session.send(
>     154             req, stream=True, timeout=self._options.timeout
>     155         ) as resp:
> 
> [/usr/local/lib/python3.10/dist-packages/requests/sessions.py](https://localhost:8080/#) in send(self, request, **kwargs)
>     701 
>     702         # Send the request
> --> 703         r = adapter.send(request, **kwargs)
>     704 
>     705         # Total elapsed time of the request (approximately)
> 
> [/usr/local/lib/python3.10/dist-packages/requests/adapters.py](https://localhost:8080/#) in send(self, request, stream, timeout, verify, cert, proxies)
>     515             if isinstance(e.reason, _SSLError):
>     516                 # This branch is for urllib3 v1.22 and later.
> --> 517                 raise SSLError(e, request=request)
>     518 
>     519             raise ConnectionError(e, request=request)
> 
> SSLError: HTTPSConnectionPool(host='omnipathdb.org', port=443): Max retries exceeded with url: /interactions?datasets=dorothea&dorothea_levels=A%2CB%2CC%2CD&fields=curation_effort%2Cdorothea_level%2Cevidences%2Cextra_attrs%2Creferences%2Csources&format=tsv&genesymbols=1&organisms=9606 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1007)')))

**To Reproduce**
Steps to reproduce the behavior:

1. Start a new Google Colab notebook
2. Enter the code from the four steps above, one cell per step
3. Run each cell in order

**Expected behavior**
I can successfully retrieve the data frames of both GRN networks in the Google Colab environment.

**System**
 - OS: Ubuntu 22.04.2 LTS
 - Python version: 3.10.12

PauBadiaM commented 1 year ago

Hi @hengkp,

Unfortunately, it seems like the omnipath server is down currently. @deeenes could you have a look?

hengkp commented 1 year ago

Hi @PauBadiaM,

Thank you for the information. When will it be available again? Also, is there an alternative way to retrieve the GRN networks?

Heng

deeenes commented 1 year ago

Hi,

The server worked fine until about an hour ago, when our certificates expired. The server is actually up and running, but clients throw an SSL error. The certificate renewal failed because the nginx process for some reason could not be stopped, so certbot could not bind to ports 80 and 443. While trying to fix this, I exceeded the Let's Encrypt rate limits, and I now have to wait one hour before the next attempt to renew the certificates. Until then, the server runs with expired certificates, causing an SSL error in most clients and browsers. This issue affects all the domains on the certificate, including our group web page saezlab.org. I hope the issue will be fixed within one hour. Apologies for the inconvenience.

The server works on the plain HTTP domains (http://no-tls.omnipathdb.org/, http://no-tls.static.omnipathdb.org/). It's not clear to me how to change the URL in the omnipath Python module. We're working on a feature to make it fall back to those non-HTTPS domains automatically, which should address most TLS-related issues.
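As a stopgap while the certificate is expired, the same query that appears in the DoRothEA traceback could in principle be sent to the plain-HTTP host directly, bypassing the omnipath module. The following is a minimal sketch under that assumption, using only the Python standard library; the helper names `build_interactions_url` and `fetch_rows` are hypothetical, and the raw rows it returns are not post-processed the way `dc.get_dorothea` does.

```python
import csv
import io
import urllib.request
from urllib.parse import urlencode

# Plain-HTTP host named above; the helper names below are illustrative only.
NO_TLS_BASE = "http://no-tls.omnipathdb.org"


def build_interactions_url(base=NO_TLS_BASE, **params):
    """Build an /interactions query URL; list values are joined with commas."""
    flat = {
        key: ",".join(map(str, value)) if isinstance(value, (list, tuple)) else str(value)
        for key, value in params.items()
    }
    return f"{base.rstrip('/')}/interactions?{urlencode(flat)}"


def fetch_rows(url, timeout=30):
    """Download a TSV response and return it as a list of dicts (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        text = resp.read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


# Same query parameters as in the failing request logged in the traceback.
dorothea_url = build_interactions_url(
    datasets="dorothea",
    dorothea_levels=["A", "B", "C", "D"],
    fields=["curation_effort", "dorothea_level", "evidences",
            "extra_attrs", "references", "sources"],
    format="tsv",
    genesymbols=1,
    organisms=9606,
)
print(dorothea_url)
```

Calling `fetch_rows(dorothea_url)` would then download the table over unencrypted HTTP, so this is only reasonable for public data and as a temporary measure.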

Best,

Denes

deeenes commented 1 year ago

I have successfully renewed the certificates, so it should work from now on.

hengkp commented 1 year ago

Hi @deeenes,

It works now for me. I appreciate your support.

Heng

PattF commented 1 year ago

Hi, I'm still having the same issue. I can't get past the first step:

    net = dc.get_collectri(organism='human', split_complexes=False)
    net

without receiving the following error:

> ValueError: Invalid value `loops` for `InteractionsQuery`. Valid options are: `['database', 'databases', 'dataset', 'datasets', 'directed', 'directeds', 'dorothea_level', 'dorothea_levels', 'dorothea_method', 'dorothea_methods', 'entity_type', 'entity_types', 'field', 'fields', 'format', 'formats', 'genesymbol', 'genesymbols', 'header', 'headers', 'license', 'licenses', 'limit', 'limits', 'organism', 'organisms', 'partner', 'partners', 'password', 'passwords', 'resource', 'resources', 'signed', 'signeds', 'source_target', 'source_targets', 'source', 'sources', 'target', 'targets', 'tfregulons_level', 'tfregulons_levels', 'tfregulons_method', 'tfregulons_methods', 'type', 'types']`.
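
The CollecTRI traceback shows decoupler's `get_collectri` passing `loops=True` to `op.interactions.CollecTRI.get`, while the installed omnipath does not list `loops` among its valid query parameters. One possible explanation, offered here as an assumption rather than a confirmed diagnosis, is a mismatch between the installed versions of the two packages. A minimal sketch for collecting those versions to include in a report (the helper name `installed_versions` is hypothetical):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("decoupler", "omnipath")):
    """Return {package name: version string, or None if the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print one line per package, suitable for pasting into an issue.
for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

Posting this output together with the error makes it easier for maintainers to tell whether the two packages are out of step.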