NCATSTranslator / ReasonerAPI

NCATS Biomedical Translator Reasoners Standard API

Proposed bypass_cache directive in Query #473

Closed by edeutsch 8 months ago

cbizon commented 9 months ago

Just following up from the 2/13 Architecture call - is there a definition of how much caching needs to be bypassed to comply? Just to muddy the waters with some examples -

ARAGORN caches KP edges, and then on top of that caches full responses. In a bypass-cache scenario is the goal to bypass both kinds?

Unsecret "caches" in the sense that they pre-build a KG based on large inputs that they periodically receive. Since there is no way to rebuild that on the fly, any downstream caching doesn't really matter since the source data won't change.

I think ARAX pre-caches MVP1 results in a way that can't really be regenerated on the fly. How should it respond? (Please correct me if I'm wrong.)

So I guess maybe it's going to be a bit fuzzy, but is the idea something like "bypass all caching as much as possible"?

edeutsch commented 9 months ago

I agree that this is worth documenting carefully. You are right that for the MVP1 query, ARAX consults some pre-computed results from an ML model (among other sources) that cannot be recomputed on the fly.

So I agree that it would be a bit fuzzy. I would state it as "bypass as much caching as is feasible".

How about this for a succinct statement:

When a client provides the bypass_cache=true option to an agent, the agent MUST request fresh information from KPs in all cases where it has a viable choice between requesting fresh data and using cached information.

Here "viable choice" is the finessed phrase. I suppose I would define it this way: does the agent have code in place that could either request fresh data or use cached data? If there is no code in place to request fresh data in real time, then there is no viable choice.
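
To make the "viable choice" rule concrete, here is a minimal Python sketch. It is not taken from ARAGORN, ARAX, or any other agent: the `EdgeSource` class, its `fetch_fresh` callable, and the `lookup` method are hypothetical names invented for illustration. The assumption is that a source with no real-time fetch code (for example, a pre-built KG) simply has `fetch_fresh=None`, so it has no viable choice and keeps serving what it has. Refreshing the cache after a forced fetch is also an assumption, not something the proposal requires.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class EdgeSource:
    """Hypothetical wrapper around one source of edges (e.g. a KP)."""
    name: str
    # None means there is no code path to fetch fresh data in real time.
    fetch_fresh: Optional[Callable[[str], List[str]]] = None
    cache: Dict[str, List[str]] = field(default_factory=dict)

    def lookup(self, key: str, bypass_cache: bool = False) -> List[str]:
        has_viable_choice = self.fetch_fresh is not None

        # Use the cache unless the client asked to bypass it AND a
        # real-time fetch is actually possible for this source.
        if key in self.cache and not (bypass_cache and has_viable_choice):
            return self.cache[key]

        if has_viable_choice:
            result = self.fetch_fresh(key)
            self.cache[key] = result  # assumption: keep the cache up to date
            return result

        # No cached entry and no way to fetch: nothing to return.
        return []


# A source that can be queried live, with one stale cached entry.
live = EdgeSource(
    "live_kp",
    fetch_fresh=lambda key: [f"fresh:{key}"],
    cache={"q": ["stale:q"]},
)
# A pre-built source that cannot be regenerated on the fly.
prebuilt = EdgeSource("prebuilt_kg", cache={"q": ["prebuilt:q"]})

print(live.lookup("q"))                         # served from cache
print(live.lookup("q", bypass_cache=True))      # fresh fetch is forced
print(prebuilt.lookup("q", bypass_cache=True))  # no viable choice: cached data
```

With several cache layers (cached KP edges plus cached full responses, say), the same check would presumably apply at each layer independently.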

Comments?

edeutsch commented 9 months ago

I have refined the definition of bypass_cache based on conversation to:

        Set to true in order to request that the agent obtain
        fresh information from its sources in all cases where
        it has a viable choice between requesting fresh information
        in real time and using cached information.
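
For illustration only, a client request carrying the directive might look like the sketch below. It assumes `bypass_cache` sits at the top level of the Query object (as the issue title suggests) and that an absent value is treated as false. The query graph contents and the `wants_fresh_data` helper are made up for the example and are not part of the proposal.

```python
import json

# Illustrative Query payload; the node/edge contents are placeholders.
query = {
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {"ids": ["MONDO:0005148"]},
                "n1": {"categories": ["biolink:ChemicalEntity"]},
            },
            "edges": {
                "e0": {
                    "subject": "n1",
                    "object": "n0",
                    "predicates": ["biolink:treats"],
                }
            },
        }
    },
    # Assumed placement: a top-level property of the Query.
    "bypass_cache": True,
}


def wants_fresh_data(query: dict) -> bool:
    """Hypothetical helper: True only when bypass_cache is boolean true.

    Treating a missing or non-boolean value as false is an assumption.
    """
    return query.get("bypass_cache", False) is True


print(json.dumps({"bypass_cache": query["bypass_cache"]}))
print(wants_fresh_data(query))
```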

Please approve, suggest further refinements, or raise any problems you have with this. At a minimum, be prepared to come to a resolution at tomorrow's TRAPI call.

cbizon commented 9 months ago

I like this refinement FWIW