Open pchampin opened 7 months ago
Hi @pchampin 👋
The functionality you are describing is available in this actor: https://github.com/comunica/comunica-feature-link-traversal/tree/master/packages/actor-rdf-resolve-hypermedia-links-traverse-annotate-source-graph
It's not part of the default configuration, but a separate one, which has a corresponding web client here: https://comunica.github.io/comunica-feature-link-traversal-web-clients/builds/solid-prov-sources/
We haven't done any experiments with it so far, so we don't know at the moment how much overhead the implementation causes.
There may also be alternative approaches to achieving triple provenance, such as quoted triples from RDF-star. This has been on hold for a while, but now that Comunica supports RDF-star, we could theoretically start building such an implementation.
Great, thanks @rubensworks .
Is there a way to use the command-line tool with this specific configuration file? (I tried the -c flag, but it does not seem to work...)
That should be possible using the dynamic variant of the CLI tool (I suspect comunica-dynamic-sparql-link-traversal-solid in your case) and setting the COMUNICA_CONFIG environment variable.
Thanks again @rubensworks but I had no luck with the config file. Below is the command line I used:
COMUNICA_CONFIG=config-solid-prov-sources.json \
my-comunica \
"PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?name ?g { <https://champin.net/#pa> foaf:knows ?p. GRAPH ?g { ?p foaf:name ?name } }" \
--lenient \
-l debug 2>/tmp/comunica-log
Note that my-comunica is an alias for comunica-dynamic-sparql-link-traversal-solid.
I get no results. However, when I remove the GRAPH ?g clause around the second triple pattern, I do get results. So the triples are retrieved, but not put in named graphs as I was expecting...
I tested it both with the version installed from NPM (2.10.1) and with the version built from the master branch (bb3fa62).
@pchampin Could you try again with the flag --unionDefaultGraph?
It seems to be working here with this query: https://comunica.github.io/comunica-feature-link-traversal-web-clients/builds/solid-prov-sources/#transientDatasources=https%3A%2F%2Fwww.rubensworks.net%2F&query=SELECT%20DISTINCT%20*%20WHERE%20%7B%0A%20%20%20%20GRAPH%20%3Fsource%20%7B%0A%20%20%20%20%20%20%3Fperson%20foaf%3Aname%20%3Fname.%0A%09%7D%0A%7D However, it looks like some results have an empty graph binding, so the implementation probably has some issues still. (it's quite old, so things may have broken with more recent changes)
I did try with --unionDefaultGraph already, and yes, it provides results, but for the wrong reason... In fact, even with the default configuration AND the --unionDefaultGraph option, I get exactly the same result (with an empty IRI bound to ?g).
My understanding is that, when --unionDefaultGraph is on, the default graph is a read-only view, so simple triples (as opposed to quads) are added in the graph named <> (empty IRI). If anything, the results we get when turning on this option show that the 'annotate-source-graph' actor fails to add the triples to the right named graph...
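To make the symptom above concrete, here is a toy model in TypeScript. It is not Comunica's code: the Quad shape, the graphVarBindings helper, and the includeDefault parameter (standing in for the effect attributed to --unionDefaultGraph above) are all hypothetical, and the example.org IRIs are placeholders. It only illustrates why unannotated triples would give no results for GRAPH ?g without the flag, and an empty graph binding with it.

```typescript
// Toy model of the observed behaviour; all names here are hypothetical.
interface Quad { s: string; p: string; o: string; g: string } // g === "" means "default graph"

// Evaluate the equivalent of `GRAPH ?g { ?s ?p ?o }` and return the ?g bindings.
// Without includeDefault, only quads in a named graph can match.
function graphVarBindings(quads: Quad[], includeDefault: boolean): string[] {
  return quads
    .filter((q) => includeDefault || q.g !== "")
    .map((q) => q.g);
}

// A triple that was retrieved but never annotated with its source.
const unannotated: Quad[] = [
  { s: "https://example.org/a#me", p: "http://xmlns.com/foaf/0.1/name", o: "A", g: "" },
];

// The same triple as the annotate-source-graph actor is expected to emit it.
const annotated: Quad[] = [
  { s: "https://example.org/a#me", p: "http://xmlns.com/foaf/0.1/name", o: "A", g: "https://example.org/a" },
];

const withoutFlag = graphVarBindings(unannotated, false); // no bindings at all
const withFlag = graphVarBindings(unannotated, true);     // one binding, with an empty graph
const expected = graphVarBindings(annotated, false);      // one binding, with the source IRI
```

In this model, only the annotated case gives a useful ?g without the flag, which is consistent with the conclusion that the annotation step is not taking effect.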
Ok, thanks for checking. So something is definitely going wrong in the 'annotate-source-graph' actor then...
Issue type:
Description:
Currently, there is no way to know from which source the link traversal retrieved a given triple. I would like, for example, to be able to ask the following query:
to determine whether the name of a person comes from their own profile or another source.
Of course, I would expect the default graph to be, by default, the merge of all named graphs, so that "flat" queries still work as expected.
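The behaviour requested above can be sketched with a small in-memory model in TypeScript. This is only an illustration under stated assumptions, not how Comunica implements or would implement it: the Quad type and the annotate, defaultGraph and namesWithSource helpers are hypothetical names, and the example.org IRIs are placeholders.

```typescript
// Toy model of the requested semantics; all names here are hypothetical.
type Triple = [string, string, string];
interface Quad { s: string; p: string; o: string; g: string } // g = IRI of the source document

// Put every triple retrieved from `source` into a graph named after that source.
function annotate(source: string, triples: Triple[]): Quad[] {
  return triples.map(([s, p, o]) => ({ s, p, o, g: source }));
}

// Default graph as the merge of all named graphs: drop the graph term and deduplicate,
// so that "flat" queries see each triple once.
function defaultGraph(quads: Quad[]): Triple[] {
  const merged: Triple[] = [];
  for (const q of quads) {
    const seen = merged.some(([s, p, o]) => s === q.s && p === q.p && o === q.o);
    if (!seen) { merged.push([q.s, q.p, q.o]); }
  }
  return merged;
}

// Equivalent of `GRAPH ?g { ?person <predicate> ?name }`: keep the source in each result.
function namesWithSource(quads: Quad[], predicate: string): { person: string; name: string; g: string }[] {
  return quads
    .filter((q) => q.p === predicate)
    .map((q) => ({ person: q.s, name: q.o, g: q.g }));
}

const FOAF_NAME = "http://xmlns.com/foaf/0.1/name";
const person = "https://example.org/alice#me";

// The same name statement found in the person's own profile and in another document.
const store: Quad[] = annotate("https://example.org/alice", [[person, FOAF_NAME, "Alice"]])
  .concat(annotate("https://example.org/bob", [[person, FOAF_NAME, "Alice"]]));

const rows = namesWithSource(store, FOAF_NAME); // one row per source
const flat = defaultGraph(store);               // one merged triple
```

Here rows keeps one entry per source document, so a query can tell whether a name comes from the person's own profile, while flat contains the statement only once.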
cc @lecoqlibre @FabienGandon