Closed Aklakan closed 1 week ago
This PR is a good step forward.
This PR is really two things:
The first is as little as a "move one line" fix :-)
Having timeouts on updates is good, and the way it is done in this PR, timing out the WHERE clause of a modify (DELETE-INSERT-WHERE), seems the best way (I'd played around with update timeout ... but stalled on that) because it protects against a long-running WHERE.
There are various operations for updates, and a general timeout on an update isn't one change in one place. An update can be several operations all in one request, so there could be an overall timeout (caveat the issue of various operations).
The PR does have an impact on applications. This is not to say I think that it is a bad (or good) idea, but to draw out a consequence.
If I've read the PR right, it comes down to this one line: pass the context to the WHERE evaluation. If a global timeout for queries is set, then that will be picked up by the update execution, where it wasn't in 5.2 and earlier.
For Fuseki, I would guess that timeout is usually per-endpoint; it is possible to set it database or server wide as well.
For an application using Jena as a library, an application-wide timeout might be present, and that would impact updates with this PR.
If a global timeout for queries is set, then that will be picked up by the update execution, where it wasn't in 5.2 and earlier.
ARQ.updateTimeout and have the UpdateEngineWorker (or perhaps already at updateExecBuilder.build()) do cxt.set(ARQ.queryTimeout, cxt.get(ARQ.updateTimeout))
ARQ.requestType context symbol for query and update requests (and perhaps dataset ones; cf. LinkDataset). The appropriate timeout value is then picked based on the context's request type, which might be cleaner than having to mutate the context - i.e. no need to copy the updateTimeout value to queryTimeout.
I'd say that the basic updateTimeout setting should affect the overall timeout of an update request - not per sub-update - so the remaining time would have to be passed on to the query exec. Hm, in this case the context would have to be mutated or copied.
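To make the two suggestions above easier to compare, here is a rough sketch of both. It does not use Jena classes: the context is modelled as a plain Map, and `TimeoutSelectionSketch`, `RequestType`, `copyForUpdate` and `effectiveTimeout` are hypothetical names made up for illustration. Dropping the query timeout when no update timeout is set is my assumption about what "backward compatible" would mean here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Illustration only: a Map stands in for the context; none of these names are Jena API. */
final class TimeoutSelectionSketch {
    enum RequestType { QUERY, UPDATE }

    static final String QUERY_TIMEOUT = "queryTimeout";   // stand-in for ARQ.queryTimeout
    static final String UPDATE_TIMEOUT = "updateTimeout"; // stand-in for ARQ.updateTimeout

    /** Suggestion 1: copy the context and derive the query timeout from the update timeout. */
    static Map<String, Long> copyForUpdate(Map<String, Long> cxt) {
        Map<String, Long> copy = new HashMap<>(cxt);
        Long updateTimeout = cxt.get(UPDATE_TIMEOUT);
        if (updateTimeout != null) {
            copy.put(QUERY_TIMEOUT, updateTimeout);
        } else {
            // Assumption: a global query timeout should not leak into updates.
            copy.remove(QUERY_TIMEOUT);
        }
        return copy;
    }

    /** Suggestion 2: leave the context untouched and select the timeout by request type. */
    static Optional<Long> effectiveTimeout(Map<String, Long> cxt, RequestType type) {
        String key = (type == RequestType.UPDATE) ? UPDATE_TIMEOUT : QUERY_TIMEOUT;
        return Optional.ofNullable(cxt.get(key));
    }

    public static void main(String[] args) {
        Map<String, Long> cxt = new HashMap<>();
        cxt.put(QUERY_TIMEOUT, 1_000L);
        cxt.put(UPDATE_TIMEOUT, 30_000L);
        System.out.println(copyForUpdate(cxt).get(QUERY_TIMEOUT));
        System.out.println(effectiveTimeout(cxt, RequestType.UPDATE));
    }
}
```

The second variant avoids the copy, but the query exec would then need to know which request type it is serving.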
The suggestions should be backward compatible with the existing behavior (but may require a change for computing the effective timeout in the query exec).
I have consolidated the timeout logic (initial and overall) into the Timeouts class.
There is now ARQ.updateTimeout. The UpdateEngineWorker tracks its own elapsed processing time and updates its context with the remaining query timeout before starting the underlying query execution.
The update timeouts are now only used to update ARQ.queryTimeout in the context. If no update timeout is set then the UpdateEngineWorker will unset the query timeout which should be consistent with the previous behavior.
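For readers who want the idea without digging through the diff, the remaining-time bookkeeping could look roughly like the sketch below. This is not the code in the PR: `UpdateBudgetSketch` and `remainingMs` are hypothetical names, the clock is injected so the example is deterministic, and an empty result stands for "no update timeout set, so no query timeout either".

```java
import java.util.OptionalLong;
import java.util.function.LongSupplier;

/** Illustration only: tracks one overall budget across the operations of an update request. */
final class UpdateBudgetSketch {
    private final OptionalLong overallTimeoutMs;
    private final LongSupplier clockMs;
    private final long startMs;

    UpdateBudgetSketch(OptionalLong overallTimeoutMs, LongSupplier clockMs) {
        this.overallTimeoutMs = overallTimeoutMs;
        this.clockMs = clockMs;
        this.startMs = clockMs.getAsLong(); // the budget starts when the request starts
    }

    /**
     * Time left for the next WHERE evaluation.
     * Empty means no timeout applies; 0 means the budget is used up
     * (how an engine reacts to 0 is left open here).
     */
    OptionalLong remainingMs() {
        if (!overallTimeoutMs.isPresent()) {
            return OptionalLong.empty();
        }
        long elapsed = clockMs.getAsLong() - startMs;
        return OptionalLong.of(Math.max(0L, overallTimeoutMs.getAsLong() - elapsed));
    }

    public static void main(String[] args) {
        long[] now = {0L}; // fake clock so the output does not depend on wall time
        UpdateBudgetSketch budget = new UpdateBudgetSketch(OptionalLong.of(1_000L), () -> now[0]);
        now[0] = 400L;   // first sub-update took 400 ms
        System.out.println(budget.remainingMs());
        now[0] = 1_500L; // budget exceeded
        System.out.println(budget.remainingMs());
    }
}
```

Each sub-update would ask for `remainingMs()` just before its query execution starts, so later operations get whatever earlier ones left over.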
Conversely, the update timeouts do not affect the sinks: In principle a sink could also refuse to accept changes once the timeout is reached - but if that's really needed it could be a future update.
I updated this PR's summary in the first post. The main changes are the introduction of ARQ.updateTimeout and the additional cancel check based on Thread.interrupted() in QueryIteratorBase.requestingCancel().
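As a small illustration of the interrupt-based cancel check, and not the actual QueryIteratorBase code, the pattern could be sketched like this. `CancelCheckSketch` is a hypothetical class; the one detail worth noting is that Thread.interrupted() clears the thread's interrupt flag, so the sketch latches the result into its own flag.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Illustration only: a cancel check that also honours thread interruption. */
final class CancelCheckSketch {
    private final AtomicBoolean cancelRequested = new AtomicBoolean(false);

    /** Explicit cancel, e.g. from a timeout or an abort call. */
    void cancel() {
        cancelRequested.set(true);
    }

    /** True once cancel() was called or the calling thread was interrupted. */
    boolean requestingCancel() {
        // Thread.interrupted() resets the interrupt flag, so remember the signal here;
        // otherwise a second check would miss it.
        if (Thread.interrupted()) {
            cancelRequested.set(true);
        }
        return cancelRequested.get();
    }

    public static void main(String[] args) {
        CancelCheckSketch check = new CancelCheckSketch();
        System.out.println(check.requestingCancel());
        Thread.currentThread().interrupt(); // simulate e.g. an executor shutdown
        System.out.println(check.requestingCancel());
        System.out.println(check.requestingCancel());
    }
}
```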
It's ready for another review.
I removed the two unused imports.
Thank you!
GitHub issue resolved #2821
Pull request Description:
Fixed inconsistent context handling in the update builder machinery and made it consistent with that in the query builder machinery.
The context of update requests is now forwarded to the query execution.
UpdateEngineWorker sets the remaining query timeout if the newly introduced ARQ.updateTimeout has been set. Note that it is possible to set an initial timeout via ARQ.updateTimeout, from which an ARQ.queryTimeout setting is derived internally: an update request would abort if the corresponding query execution timed out on the initial timeout - so only updates with a WHERE clause are affected.
I did not add initialTimeout methods to the update exec builders because an initial timeout on updates is perhaps a bit too specific for the general interface - opinions?
Added UpdateProcessor.getContext() for consistency with QueryExec. This is for discussion.
Added UpdateProcessor.abort(), which can abort the underlying query execution if it is based on QueryIterator.
Added test cases to the TestUpdateExecutionCancel class.
For HTTP-based updates, the overall update timeout is forwarded to the existing HTTP machinery.
There is now a class TestSPARQLProtocolTimeout which tests setting the timeout on the update builder.
QueryIteratorBase.requestingCancel was updated to check Thread.interrupted() so that when Fuseki gets shut down (e.g. during unit testing), any threads still busy executing queries can cancel execution in a timely manner.
Added query/update tests that register a local function that checks for the availability of the cancel signal. Please also see the test cases which register the function that checks for the cancel signal in three ways for both queries and updates.
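To illustrate what aborting an update through its underlying WHERE iteration means, here is a minimal sketch in plain Java. It makes no claim about how UpdateProcessor.abort() is implemented: `AbortableUpdateSketch`, `Cancelled` and `execute` are hypothetical, and the WHERE result is just an Iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Illustration only: an update whose WHERE iteration can be aborted from outside. */
final class AbortableUpdateSketch {
    /** Hypothetical exception signalling that the update was aborted. */
    static final class Cancelled extends RuntimeException {
        Cancelled() { super("update aborted"); }
    }

    private final AtomicBoolean aborted = new AtomicBoolean(false);

    /** May be called from any thread. */
    void abort() {
        aborted.set(true);
    }

    /** Feeds each WHERE row to the sink and checks the abort flag before every row. */
    <T> int execute(Iterator<T> whereRows, Consumer<T> sink) {
        int applied = 0;
        while (whereRows.hasNext()) {
            if (aborted.get()) {
                throw new Cancelled();
            }
            sink.accept(whereRows.next());
            applied++;
        }
        return applied;
    }

    public static void main(String[] args) {
        AbortableUpdateSketch update = new AbortableUpdateSketch();
        List<Integer> seen = new ArrayList<>();
        try {
            // The sink aborts after the third row to keep the example deterministic.
            update.execute(Arrays.asList(1, 2, 3, 4, 5).iterator(), row -> {
                seen.add(row);
                if (row == 3) update.abort();
            });
        } catch (Cancelled e) {
            System.out.println("aborted after rows " + seen);
        }
    }
}
```

An update without a WHERE clause has no such iteration to interrupt, which matches the point above that only updates with a WHERE clause are affected.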
By submitting this pull request, I acknowledge that I am making a contribution to the Apache Software Foundation under the terms and conditions of the Contributor's Agreement.
See the Apache Jena "Contributing" guide.