dotnetrdf / dotnetrdf

dotNetRDF is a powerful and flexible API for working with RDF and SPARQL in .NET environments
http://dotnetrdf.org/

Can't execute a federated INSERT against an in-memory dataset #641

Closed (rdstn closed this issue 4 months ago)

rdstn commented 5 months ago

Hello,

I'm trying to execute a federated SPARQL INSERT against an in-memory dataset. The idea is to carry out some post-processing and then feed the model, including quads, to a different service. Since I need the context (named graph) data, I can't use CONSTRUCT; otherwise I would have worked around this.

The SPARQL update can be something very simple: query = "INSERT {?s ?p ?o} WHERE { SERVICE <http://localhost:7200/repositories/test> { ?s ?p ?o }}";

This one doesn't have graph data, so, strictly speaking, I could use CONSTRUCT and then parse the response, but it's just an example, not the actual query I need.

And here's how it's getting invoked:

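        // Empty in-memory store that should receive the inserted triples
        TripleStore store = new TripleStore();
        // Note: ds is created here but is not passed to the processor below
        InMemoryQuadDataset ds = new InMemoryQuadDataset(store);
        LeviathanUpdateProcessor processor = new LeviathanUpdateProcessor(store);
        // Federated update: pull triples from the remote repository via SERVICE
        var query = "INSERT {?s ?p ?o} WHERE { SERVICE <http://localhost:7200/repositories/test> { ?s ?p ?o }}";
        SparqlUpdateParser sparqlparser = new SparqlUpdateParser();
        SparqlUpdateCommandSet update = sparqlparser.ParseFromString(query);
        // The exception shown below is thrown from this call
        processor.ProcessCommandSet(update);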

I get treated to an exception which suggests that the error may be in the remote service:

Unhandled exception. VDS.RDF.Query.RdfQueryException: Query execution failed because evaluating a SERVICE clause failed - this may be due to an error with the remote service
 ---> System.AggregateException: One or more errors occurred. (A task was canceled.)
 ---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at System.Threading.Tasks.Task.GetExceptions(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at VDS.RDF.Query.LeviathanQueryProcessor.ProcessService(Service service, SparqlEvaluationContext context)
   at VDS.RDF.Query.Algebra.Service.Accept[TResult,TContext](ISparqlQueryAlgebraProcessor`2 processor, TContext context)
   at VDS.RDF.Query.LeviathanQueryProcessor.ProcessAlgebra(ISparqlAlgebra algebra, SparqlEvaluationContext context)
   at VDS.RDF.Query.SparqlEvaluationContext.Evaluate(ISparqlAlgebra algebra)
   at VDS.RDF.Update.LeviathanUpdateProcessor.ProcessInsertCommandInternal(InsertCommand cmd, SparqlUpdateEvaluationContext context)
   at VDS.RDF.Update.LeviathanUpdateProcessor.ProcessCommandInternal(SparqlUpdateCommand cmd, SparqlUpdateEvaluationContext context, Boolean autoCommit)
   at VDS.RDF.Update.LeviathanUpdateProcessor.ProcessCommandSet(SparqlUpdateCommandSet commands)
   at QueryProcessorExample.Main() in E:\Programs\IDEs\workspace\temp\UKP\ConsoleApp1\ConsoleApp1\Program.cs:line 82
--- End of stack trace from previous location ---

However, I've checked and no traffic happens between the test program and the remote database.

Is it possible to execute a federated insert against an in-memory dataset? The documentation doesn't mention it explicitly and I've seen no tests which cover it. If not, what would you recommend as a workaround?

kal commented 4 months ago

Does the same problem occur if you perform a SELECT query rather than an INSERT update, or does it only happen on an update? The stack trace indicates that the problem is in the processing of the SERVICE clause and not in the INSERT itself.

Also, do you see a noticeable delay before the query fails? That is, is it possible that a timeout limit is being hit?

rdstn commented 4 months ago

Hello. SELECT works OK, and so do standard INSERT and CONSTRUCT queries. It's using SERVICE within the INSERT that causes the issue.

I had the idea to try a nested query:

INSERT {
    ?s ?p ?o 
} WHERE {
    SELECT ?s ?p ?o {
        SERVICE <http://localhost:7200/repositories/test> {
            ?s ?p ?o 
        }
    }
}

but it throws the same exception.

The exception is thrown immediately; there is no timeout from what I can see. I've also set the timeout in the options to 60s. The exception comes up as soon as the code reaches the update evaluation step.

kal commented 4 months ago

Thanks for the extra information and the investigation. If it's failing immediately, then it seems most likely that there is an unexpected exception while setting up the service call. That's really helpful information for narrowing down the cause!