TranslatorSRI / NameResolution

A service for finding CURIEs from lexical strings.

Slow performance with short words in search query #107

Open EvanDietzMorris opened 8 months ago

EvanDietzMorris commented 8 months ago

Still working on a more comprehensive list, but these are a couple of examples that were really slow:

- Collapsin response mediator protein 2
- Apolipoprotein A-I binding protein (AIBP)

gaurav commented 8 months ago

I bet this is related: https://github.com/TranslatorSRI/NameResolution/issues/95

YaphetKG commented 1 month ago

I have noticed a couple of connection reset errors while synonymizing a batch of terms...


```
...  File "/home/airflow/.local/lib/python3.11/site-packages/dug/core/annotators/sapbert_annotator.py", line 76, in __call__
    norm_id.synonyms = self.synonym_finder(norm_id.id, http_session)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/dug/core/annotators/_base.py", line 197, in __call__
    response = self.make_request(curie, http_session)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/dug/core/annotators/_base.py", line 207, in make_request
    response = http_session.post(url, json=payload)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/requests_cache/session.py", line 137, in post
    return self.request('POST', url, data=data, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/requests_cache/session.py", line 182, in request
    return super().request(method, url, *args, headers=headers, **kwargs)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/requests_cache/session.py", line 229, in send
    response = self._send_and_cache(request, actions, cached_response, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/requests_cache/session.py", line 253, in _send_and_cache
    response = super().send(request, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/requests/adapters.py", line 501, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
[2024-05-26, 14:48:22 EDT] {taskinstance.py:1400} INFO - Marking task as FAILED. dag_id=annotate_and_index, task_id=parent-dbgap_dataset_pipeline_task_group.annotate_parent-dbgap_files, execution_date=20240513T231302, start_date=20240526T030001, end_date=20240526T184822
```

This is the error I was seeing; it happened around 2024-05-26, 14:48:22 EDT. I looked into the Loki logs for NameRes (synonymizer_url: http://name-resolution-name-lookup-web-svc.translator-dev:2433/reverse_lookup) but wasn't able to find anything useful.

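For anyone hitting the same `RemoteDisconnected` error, one possible client-side mitigation is to retry the call with a growing delay. The following is only a minimal sketch, not part of dug or NameRes: the helper name `call_with_retries` and its parameters are hypothetical, and it assumes the failure is transient and that repeating the request is safe.

```python
import time


def call_with_retries(func, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); if it raises an exception listed in retry_on, wait and retry.

    Hypothetical helper for illustration only. The wait doubles after each
    failed attempt (base_delay, 2 * base_delay, ...). The final failure is
    re-raised so the caller still sees the original error.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original exception
            sleep(base_delay * 2 ** attempt)
```

With the call from the traceback above, this could be used as `call_with_retries(lambda: http_session.post(url, json=payload), retry_on=requests.exceptions.ConnectionError)`. Retrying only hides the symptom; it does not explain why the server closed the connection.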
gaurav commented 1 month ago

@YaphetKG That's probably not related to this issue, which is caused by sending /lookup a search phrase containing a small word (e.g. A or 17). Your issue appears to involve the /reverse_lookup endpoint, which should be a very quick lookup operation on Solr. I am seeing some CPU throttling on the web frontend, so maybe that's what's causing your issue? I've increased the memory and CPU available to the NameRes Dev frontend: https://github.com/helxplatform/translator-devops/pull/909
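To collect more slow examples for this issue, a small timing harness along the following lines might help. It is a sketch under assumptions: `short_tokens` and `time_queries` are hypothetical names, the `lookup` argument is whatever function you use to call /lookup, and splitting on non-alphanumeric characters is only a guess at how the search backend tokenizes a phrase.

```python
import re
import time


def short_tokens(phrase, max_len=2):
    """Return the tokens of at most max_len characters in a phrase.

    Assumption: tokens are runs of letters and digits, so "A-I" yields
    "A" and "I". The real Solr analyzer may split text differently.
    """
    return [tok for tok in re.findall(r"[A-Za-z0-9]+", phrase) if len(tok) <= max_len]


def time_queries(phrases, lookup, clock=time.perf_counter):
    """Time lookup(phrase) for each phrase; return (phrase, seconds), slowest first."""
    timings = []
    for phrase in phrases:
        start = clock()
        lookup(phrase)  # caller-supplied function that sends the query
        timings.append((phrase, clock() - start))
    return sorted(timings, key=lambda item: item[1], reverse=True)
```

For the two examples reported at the top of this issue, `short_tokens` returns `['2']` and `['A', 'I']` respectively, which fits the small-word explanation.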

gaurav commented 1 week ago

I'm trying to poke at CPU/memory to see if I can fix the issue that way (spoiler: doesn't look like it): https://github.com/TranslatorSRI/NameResolution/issues/152