volkamerlab / opencadd

A Python library for structural cheminformatics
https://opencadd.readthedocs.io
MIT License

KLIFS multiprocessing #74

Closed: schallerdavid closed this issue 3 years ago

schallerdavid commented 3 years ago

Hey there,

Since my KinoML code is getting more and more resource-hungry, I have started experimenting with multiprocessing. However, when using Python's multiprocessing module, I get a bravado.http_future.RequestsFutureAdapterConnectionError from opencadd's KLIFS module. This only happens if more than one process uses the KLIFS module at the same time. Is there a way to use the KLIFS module from multiple processes?

I attached a script (klifs_mp.zip) to reproduce the error. It only requires a current opencadd installation.

# run 1 job at a time
python klifs_mp.py 1

# run 2 jobs at a time
python klifs_mp.py 2

Thanks in advance!

jaimergp commented 3 years ago

Is this related? https://github.com/Yelp/bravado/issues/471

dominiquesydow commented 3 years ago

@schallerdavid, I don't know why but I missed this issue. Is this resolved by now? It sounds like an issue we fixed some time ago: https://github.com/volkamerlab/opencadd/pull/68

If fixed, can you please close this issue?

schallerdavid commented 3 years ago

@dominiquesydow I can still observe this behavior. I get the following error message when running two processes in parallel with the above-mentioned script:

  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/urllib3/connectionpool.py", line 445, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/urllib3/connectionpool.py", line 440, in _make_request
    httplib_response = conn.getresponse()
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/http/client.py", line 1344, in getresponse
    response.begin()
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/http/client.py", line 307, in begin
    version, status, reason = self._read_status()
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/urllib3/connectionpool.py", line 445, in _make_request
    six.raise_from(e, None)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/http/client.py", line 268, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "<string>", line 3, in raise_from
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/urllib3/connectionpool.py", line 440, in _make_request
    httplib_response = conn.getresponse()
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/http/client.py", line 1344, in getresponse
    response.begin()
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/http/client.py", line 307, in begin
    version, status, reason = self._read_status()
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/http/client.py", line 268, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
ssl.SSLError: [SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac (_ssl.c:2635)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
ssl.SSLError: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:2635)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/urllib3/connectionpool.py", line 755, in urlopen
    retries = retries.increment(
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/urllib3/util/retry.py", line 574, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/urllib3/connectionpool.py", line 755, in urlopen
    retries = retries.increment(
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/urllib3/util/retry.py", line 574, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='klifs.net', port=443): Max retries exceeded with url: /api/kinase_names (Caused by SSLError(SSLError(1, '[SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac (_ssl.c:2635)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='klifs.net', port=443): Max retries exceeded with url: /api/kinase_names (Caused by SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:2635)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 124, in wrapper
    return func(self, *args, **kwargs)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 291, in _get_incoming_response
    inner_response = self.future.result(timeout=timeout)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/requests_client.py", line 266, in result
    response = self.session.send(
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 124, in wrapper
    return func(self, *args, **kwargs)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 291, in _get_incoming_response
    inner_response = self.future.result(timeout=timeout)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/requests_client.py", line 266, in result
    response = self.session.send(
requests.exceptions.SSLError: HTTPSConnectionPool(host='klifs.net', port=443): Max retries exceeded with url: /api/kinase_names (Caused by SSLError(SSLError(1, '[SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac (_ssl.c:2635)')))
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
requests.exceptions.SSLError: HTTPSConnectionPool(host='klifs.net', port=443): Max retries exceeded with url: /api/kinase_names (Caused by SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:2635)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "klifs_mp.py", line 10, in run_job
    structures = remote.structures.all_structures()
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/opencadd/databases/klifs/remote.py", line 277, in all_structures
    kinases = kinases_remote.all_kinases()
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/opencadd/databases/klifs/remote.py", line 91, in all_kinases
    self._client.Information.get_kinase_names(
  File "klifs_mp.py", line 10, in run_job
    structures = remote.structures.all_structures()
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 239, in response
    six.reraise(*sys.exc_info())
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/opencadd/databases/klifs/remote.py", line 277, in all_structures
    kinases = kinases_remote.all_kinases()
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/six.py", line 719, in reraise
    raise value
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/opencadd/databases/klifs/remote.py", line 91, in all_kinases
    self._client.Information.get_kinase_names(
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 197, in response
    incoming_response = self._get_incoming_response(timeout)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 239, in response
    six.reraise(*sys.exc_info())
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 126, in wrapper
    self.future._raise_connection_error(exception)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/six.py", line 719, in reraise
    raise value
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 91, in _raise_connection_error
    self._raise_error(BravadoConnectionError, 'ConnectionError', exception)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 197, in response
    incoming_response = self._get_incoming_response(timeout)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 79, in _raise_error
    six.reraise(
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 126, in wrapper
    self.future._raise_connection_error(exception)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/six.py", line 718, in reraise
    raise value.with_traceback(tb)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 91, in _raise_connection_error
    self._raise_error(BravadoConnectionError, 'ConnectionError', exception)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 124, in wrapper
    return func(self, *args, **kwargs)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 79, in _raise_error
    six.reraise(
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 291, in _get_incoming_response
    inner_response = self.future.result(timeout=timeout)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/six.py", line 718, in reraise
    raise value.with_traceback(tb)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/requests_client.py", line 266, in result
    response = self.session.send(
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 124, in wrapper
    return func(self, *args, **kwargs)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/http_future.py", line 291, in _get_incoming_response
    inner_response = self.future.result(timeout=timeout)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/bravado/requests_client.py", line 266, in result
    response = self.session.send(
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
bravado.http_future.RequestsFutureAdapterConnectionError
  File "/home/david/miniconda3/envs/kinoml/lib/python3.8/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
bravado.http_future.RequestsFutureAdapterConnectionError

Could this also be intended behavior on the KLIFS side, to prevent it from being overloaded with requests from a single IP?

dominiquesydow commented 3 years ago

Thanks, @schallerdavid, could you please provide a minimal example here that reproduces this error? Happy to look into this.

dominiquesydow commented 3 years ago

Ah, I see now that you have done that in the first message, thanks! :)

dominiquesydow commented 3 years ago

I cannot figure out right now why multiprocessing.Process does not work with the opencadd.databases.klifs session; maybe because in your script the session is set up from within the process?

In kissim, I use multiprocessing as well, but with multiprocessing.Pool: I set up the session before parallelization and simply pass it to each worker.

For your example it looks like this (the job is to subset structures by input kinase):

from itertools import repeat
import multiprocessing
import sys

from opencadd.databases.klifs import setup_remote

# Number of worker processes, taken from the command line
number_processes = int(sys.argv[1])

def run_job(kinase, klifs_session):
    # Fetch all structures, then report how many belong to the given kinase
    structures = klifs_session.structures.all_structures()
    print(
        f"All structures: {len(structures)}\n"
        f"{kinase} structures: {len(structures[structures['kinase.klifs_name'] == kinase])}"
    )

if __name__ == "__main__":
    # Set up the session once, before any worker is started
    remote = setup_remote()
    kinases = ["EGFR", "BRAF", "KIT", "LOK"]

    # Pair every kinase with the same session and run one job per pair
    pool = multiprocessing.Pool(processes=number_processes)
    pool.starmap(run_job, zip(kinases, repeat(remote)))
    pool.close()
    pool.join()

Output:

(opencadd) dominique@carbon: ~/Downloads
$ python klifs_mp.py 2
All structures: 12430
EGFR structures: 446
All structures: 12430
BRAF structures: 222
All structures: 12430
KIT structures: 43
All structures: 12430
LOK structures: 46

Would that work for you - or do you need to pass the process IDs explicitly as you did in your script?
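The argument pairing in the script above, zip(kinases, repeat(remote)), can be looked at in isolation. The sketch below is serial and uses only the standard library: itertools.starmap unpacks each tuple into positional arguments the same way pool.starmap does. The string "shared-session" and the helper pair_with_session are hypothetical stand-ins for illustration, not part of opencadd.

```python
from itertools import repeat, starmap


def run_job(kinase, session):
    # `session` is a hypothetical stand-in for the KLIFS session object.
    return f"{kinase} uses {session}"


def pair_with_session(kinases, session):
    # repeat() is endless; zip() stops as soon as `kinases` is exhausted.
    return list(zip(kinases, repeat(session)))


pairs = pair_with_session(["EGFR", "BRAF"], "shared-session")
print(pairs)
# [('EGFR', 'shared-session'), ('BRAF', 'shared-session')]

# starmap unpacks each tuple into run_job(kinase, session).
print(list(starmap(run_job, pairs)))
# ['EGFR uses shared-session', 'BRAF uses shared-session']
```

With multiprocessing.Pool, the arguments of each task are pickled and sent to a worker, so every job works on its own copy of whatever object is passed as the session.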

schallerdavid commented 3 years ago

This looks like a great solution. Thanks @dominiquesydow
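A different arrangement, not discussed in this thread, is to create one session per worker process through the initializer argument of multiprocessing.Pool. The sketch below uses only the standard library; make_session is a hypothetical placeholder for the place where setup_remote() would be called. Whether this avoids the SSL errors shown above was not tested here.

```python
import multiprocessing
import os

_session = None  # one per worker process, filled in by init_worker()


def make_session():
    # Hypothetical placeholder: in the thread's setting, opencadd's
    # setup_remote() would be called here instead.
    return {"created_in_pid": os.getpid()}


def init_worker():
    # Pool runs this once in every worker process when the worker starts.
    global _session
    _session = make_session()


def run_job(kinase):
    # Uses the worker-local session; the flag shows that the session
    # was created in the same process that is running the job.
    return kinase, _session["created_in_pid"] == os.getpid()


if __name__ == "__main__":
    kinases = ["EGFR", "BRAF", "KIT", "LOK"]
    with multiprocessing.Pool(processes=2, initializer=init_worker) as pool:
        for kinase, own_session in pool.map(run_job, kinases):
            print(kinase, own_session)
```

Compared with passing one session to all jobs, this keeps the job signature down to the kinase name and builds each session only once per worker rather than once per job.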