Closed amykglen closed 10 months ago
I suggest we look into this after we merge the issue-2114
branch into master
, in order to avoid merge conflicts. By "look into this", I suppose I mean at least as far as as assessing whether it would be easy to do. Priority level would, I presume, depend on the benefit (in terms of reduced chance of pickle file inconsistency) vs. expected time to implement. I'm not sure. I found that the logic in kp_info_cacher.py
was a bit tricky, but then again, the module is quite easy to test (I could even test it on my MBP).
This exception needs to be handled and turned into a TRAPI error message:
Traceback (most recent call last):
File "/mnt/data/python/Python-3.9.16/lib/python3.9/threading.py", line 980, in _bootstrap_inner
self.run()
File "/mnt/data/python/Python-3.9.16/lib/python3.9/threading.py", line 917, in run
self._target(*self._args, **self._kwargs)
File "/mnt/data/orangeboard/test/RTX/code/UI/OpenAPI/python-flask-server/openapi_server/../../../../ARAX/ARAXQuery/ARAX_query.py", line 203, in asynchronous_query
self.query(query, mode=mode, origin='API')
File "/mnt/data/orangeboard/test/RTX/code/UI/OpenAPI/python-flask-server/openapi_server/../../../../ARAX/ARAXQuery/ARAX_query.py", line 396, in query
result = self.execute_processing_plan(query, mode=mode)
File "/mnt/data/orangeboard/test/RTX/code/UI/OpenAPI/python-flask-server/openapi_server/../../../../ARAX/ARAXQuery/ARAX_query.py", line 668, in execute_processing_plan
from ARAX_expander import ARAXExpander
File "/mnt/data/orangeboard/test/RTX/code/UI/OpenAPI/python-flask-server/openapi_server/../../../../ARAX/ARAXQuery/ARAX_expander.py", line 31, in <module>
from Expand.trapi_querier import TRAPIQuerier
File "/mnt/data/orangeboard/test/RTX/code/UI/OpenAPI/python-flask-server/openapi_server/../../../../ARAX/ARAXQuery/Expand/trapi_querier.py", line 35, in <module>
class TRAPIQuerier:
File "/mnt/data/orangeboard/test/RTX/code/UI/OpenAPI/python-flask-server/openapi_server/../../../../ARAX/ARAXQuery/Expand/trapi_querier.py", line 38, in TRAPIQuerier
kp_selector: KPSelector = KPSelector(), force_local: bool = False):
File "/mnt/data/orangeboard/test/RTX/code/UI/OpenAPI/python-flask-server/openapi_server/../../../../ARAX/ARAXQuery/Expand/kp_selector.py", line 28, in __init__
self.meta_map, self.kp_urls, self.kps_excluded_by_version, self.kps_excluded_by_maturity = self._load_cached_kp_info()
File "/mnt/data/orangeboard/test/RTX/code/UI/OpenAPI/python-flask-server/openapi_server/../../../../ARAX/ARAXQuery/Expand/kp_selector.py", line 39, in _load_cached_kp_info
smart_api_info, meta_map = kp_cacher.load_kp_info_caches(self.log)
File "/mnt/data/orangeboard/test/RTX/code/ARAX/ARAXQuery/Expand/kp_info_cacher.py", line 152, in load_kp_info_caches
raise e
File "/mnt/data/orangeboard/test/RTX/code/ARAX/ARAXQuery/Expand/kp_info_cacher.py", line 146, in load_kp_info_caches
raise Exception("KP info cache(s) do not exist.")
Exception: KP info cache(s) do not exist.
Agreed that we can't allow a crash. that will terminate the BackgroundTasker. I don't think we can do a TRAPI error message, because I think this is part of the BackgroundTasker process rather than a query process.
I have been musing about trying to come up with a mechanism that can alert us by email or similar when things are not working correctly.
@sundareswarpullela based on testing, I think your code changes for this issue have the side effect that the Flask __main__.py
module needs to pass run_kp_info_cacher=True
to the ARAXBackgroundTasker
initializer, otherwise there is an error in Expand (even though I think the KP info cache is not actually used in the RTX-KG2 KP service, is it?). This is not ideal, but for the time being, I just passed =True
. Long-term, I think we should probably not be generating the KP info cache for KG2 services if we don't have to be.
Error when the RTX-KG2 KP service is handling a query, and the background tasker was started with run_kp_info_cacher=False
:
Exception in thread Thread-5:
Traceback (most recent call last):
File "/mnt/data/python/Python-3.9.16/lib/python3.9/threading.py", line 980, in _bootstrap_inner
self.run()
File "/mnt/data/python/Python-3.9.16/lib/python3.9/threading.py", line 917, in run
self._target(*self._args, **self._kwargs)
File "/mnt/data/orangeboard/kg2test/RTX/code/UI/OpenAPI/python-flask-server/KG2/openapi_server/../../../../../ARAX/ARAXQuery/ARAX_query.py", line 203, in asynchronous_query
self.query(query, mode=mode, origin='API')
File "/mnt/data/orangeboard/kg2test/RTX/code/UI/OpenAPI/python-flask-server/KG2/openapi_server/../../../../../ARAX/ARAXQuery/ARAX_query.py", line 396, in query
result = self.execute_processing_plan(query, mode=mode)
File "/mnt/data/orangeboard/kg2test/RTX/code/UI/OpenAPI/python-flask-server/KG2/openapi_server/../../../../../ARAX/ARAXQuery/ARAX_query.py", line 668, in execute_processing_plan
from ARAX_expander import ARAXExpander
File "/mnt/data/orangeboard/kg2test/RTX/code/UI/OpenAPI/python-flask-server/KG2/openapi_server/../../../../../ARAX/ARAXQuery/ARAX_expander.py", line 31, in <module>
from Expand.trapi_querier import TRAPIQuerier
File "/mnt/data/orangeboard/kg2test/RTX/code/UI/OpenAPI/python-flask-server/KG2/openapi_server/../../../../../ARAX/ARAXQuery/Expand/trapi_querier.py", line 35, in <module>
class TRAPIQuerier:
File "/mnt/data/orangeboard/kg2test/RTX/code/UI/OpenAPI/python-flask-server/KG2/openapi_server/../../../../../ARAX/ARAXQuery/Expand/trapi_querier.py", line 38, in TRAPIQuerier
kp_selector: KPSelector = KPSelector(), force_local: bool = False):
File "/mnt/data/orangeboard/kg2test/RTX/code/UI/OpenAPI/python-flask-server/KG2/openapi_server/../../../../../ARAX/ARAXQuery/Expand/kp_selector.py", line 28, in __init__
self.meta_map, self.kp_urls, self.kps_excluded_by_version, self.kps_excluded_by_maturity = self._load_cached_kp_info()
File "/mnt/data/orangeboard/kg2test/RTX/code/UI/OpenAPI/python-flask-server/KG2/openapi_server/../../../../../ARAX/ARAXQuery/Expand/kp_selector.py", line 39, in _load_cached_kp_info
smart_api_info, meta_map = kp_cacher.load_kp_info_caches(self.log)
File "/mnt/data/orangeboard/kg2test/RTX/code/ARAX/ARAXQuery/Expand/kp_info_cacher.py", line 152, in load_kp_info_caches
raise e
File "/mnt/data/orangeboard/kg2test/RTX/code/ARAX/ARAXQuery/Expand/kp_info_cacher.py", line 146, in load_kp_info_caches
raise Exception("KP info cache(s) do not exist.")
Exception: KP info cache(s) do not exist.
@sundareswarpullela and @amykglen I've fixed the bug, I think. See commit bcd4d62
Agreed that we can't allow a crash. that will terminate the BackgroundTasker. I don't think we can do a TRAPI error message, because I think this is part of the BackgroundTasker process rather than a query process.
I have been musing about trying to come up with a mechanism that can alert us by email or similar when things are not working correctly.
Eric, I think the referenced error is happening in ARAX Expand, though (if I'm not mistaken):
File "/mnt/data/orangeboard/kg2test/RTX/code/UI/OpenAPI/python-flask-server
/KG2/openapi_server/../../../../../ARAX/ARAXQuery/ARAX_expander.py", line 31, in <module>
in response to a query, so it is happening in the query controller child process. I think we can (and should) return a TRAPI error message, and zero results, in this case.
ah, I see, I misunderstood. The error is in kp_info_cacher, but that part of the code is only hit by Expander, not BackgroundTasker. I think I understand, thanks.
@sundareswarpullela: I have verified that with the latest code patches, /kg2test
works on arax.ncats.io
even if the KP info cache pickle file does not exist.
Thats Great! Thanks @saramsey
@sundareswarpullela just to be clear, we still need a fix for ARAX-Expand for the case where the KP info cache pickle file doesn't exist. Maybe you and I can talk about that tomorrow?
This has been pushed to prod. Closing issue.
Steve pointed out that having two caches that
kp_info_cacher.py
updates (meta KG and SmartAPI) notably increases the complexity of that script (in terms of multiprocessing), and there isn't really a benefit to having that info in two separate files anyway... we should probably combine them into onefrom Steve: