Open andrew-pickin-epi opened 4 years ago
The message is not that unreasonable. During load it pulls the configuration (the DSD) from the SPARQL endpoint; if that's not there, it can't configure anything. It would be easy enough to catch that sort of exception and generate a different error message, if that's worth it.
A retry would make sense and would be possible but would have to be within the `DatasetMonitor`, can just leave it unconfigured until there is a query. Which means there would need to be some bounds to the retry. Can't remember if the underlying `ConfigMonitor` is threaded or will block other config operations until it clears but certainly possible it'll block.
Is there any evidence for what a sensible retry limit is? As a default suggest a retry limit of 3 x the connection timeout.
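For illustration, a bounded retry along those lines might look like the sketch below. It is Python rather than the project's own code, and `load_with_bounded_retry`, `fetch`, `pause_s`, `clock` and `sleep` are all hypothetical names, not part of `DatasetMonitor` or `ConfigMonitor`. It assumes an unreachable endpoint surfaces as a `ConnectionError` and gives up once 3 x the connection timeout has elapsed.

```python
import time


def load_with_bounded_retry(fetch, connection_timeout_s, pause_s=1.0,
                            clock=time.monotonic, sleep=time.sleep):
    """Call fetch() until it succeeds or 3 x connection_timeout_s has elapsed.

    fetch is a hypothetical callable that pulls the configuration (the DSD)
    and raises ConnectionError while the endpoint is unreachable.
    """
    deadline = clock() + 3 * connection_timeout_s
    while True:
        try:
            return fetch()
        except ConnectionError:
            # Give up if another pause would take us past the bound,
            # so a blocked caller is released in bounded time.
            if clock() + pause_s > deadline:
                raise
            sleep(pause_s)
```

Because the loop blocks its caller, the bound matters most if the monitor turns out not to be threaded.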
In the instance in question the worker had been flagged as offline for 60s (the Apache default).
This is no longer the case.
In the event of potentially environmental issues such as this, the issue is not how many tries, as I'd suggest it should retry indefinitely. The question is how frequently? I'd suggest once or twice a minute.
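A minimal sketch of that alternative, retrying indefinitely at a fixed interval, is below. Again this is illustrative Python, not the project's code: `retry_until_configured`, `fetch`, `apply_config` and `stop` are hypothetical names, and the 30s default is just "twice a minute". It assumes an unavailable endpoint raises `ConnectionError`.

```python
import threading


def retry_until_configured(fetch, apply_config, interval_s=30.0, stop=None):
    """Retry fetch() every interval_s seconds until it succeeds or stop is set.

    Returns the number of attempts made. There is no retry limit; the only
    way out besides success is the optional stop event (e.g. on shutdown).
    """
    stop = stop or threading.Event()
    attempts = 0
    while not stop.is_set():
        attempts += 1
        try:
            apply_config(fetch())
            return attempts
        except ConnectionError:
            # wait() returns early if stop is set, so shutdown is not
            # delayed by a full interval.
            stop.wait(interval_s)
    return attempts
```

Since this never gives up, it would probably need to run on its own thread (for example `threading.Thread(target=..., daemon=True)`) so that it cannot block other config operations.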
Ref: https://epimorphics.codebasehq.com/projects/operations/tickets/414
In the above trace the ppd endpoint is not made available, as the Apache proxy fronting Fuseki has marked the server down and unavailable for 60s.
The log should make a distinction between the service being misconfigured, as this entry suggests, and the service being unavailable.
Secondly, as this is a potentially transient environment issue, this should be subject to retries.
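One way the misconfigured/unavailable distinction could be drawn is sketched below, in illustrative Python. `classify_load_failure`, `log_message` and `TRANSIENT_STATUSES` are hypothetical names. The sketch assumes the proxy answers with a 502/503/504 status while the backend is marked down, and that anything other than a connection-level failure means the configuration really is wrong. Both assumptions would need checking against the actual trace.

```python
# Assumption: gateway-style statuses from the fronting proxy are transient.
TRANSIENT_STATUSES = {502, 503, 504}


def classify_load_failure(exc=None, http_status=None):
    """Return "unavailable" for likely-transient failures, else "misconfigured"."""
    if http_status in TRANSIENT_STATUSES:
        return "unavailable"
    if isinstance(exc, (ConnectionError, TimeoutError)):
        return "unavailable"
    return "misconfigured"


def log_message(endpoint, kind):
    """Build a log line that tells the two cases apart."""
    if kind == "unavailable":
        return f"Endpoint {endpoint} is unavailable; will retry"
    return f"Endpoint {endpoint} returned no usable configuration; check the service config"
```

Only the "unavailable" case would then be subject to retries.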