zazuko / trifid

Lightweight Linked Data Server and Proxy
Apache License 2.0

"Number of results per named graph" is inaccurate #346

Open tpluscode opened 7 months ago

tpluscode commented 7 months ago

Running v5.0.2 and oxigraph I observed two surprising behaviors when opening the default entity page.

I start oxigraph with the union-default-graph option. I have a single resource in one named graph, 35 triples in total.

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix hydra: <http://www.w3.org/ns/hydra/core#> .
@prefix doap: <http://usefulinc.com/ns/doap#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .

graph <https://trippin.lndo.site/shape/project> {
   <https://trippin.lndo.site/shape/project> a sh:NodeShape ;
    sh:property [
        sh:order 3 ;
        sh:name "Start date" ;
        sh:path doap:created ;
        sh:datatype xsd:date ;
    ], [
        sh:order 4 ;
        sh:name "End date" ;
        sh:path doap:end ;
        sh:datatype xsd:date ;
    ], [
        sh:order 2 ;
        sh:name "Project description" ;
        sh:path doap:description ;
        sh:datatype xsd:string ;
    ], [
        sh:order 1 ;
        sh:name "Project name" ;
        sh:path doap:name ;
        sh:datatype xsd:string ;
        sh:minCount 1 ;
    ], [
        sh:order 5 ;
        sh:name "Responsible person" ;
        sh:path schema:accountablePerson ;
        sh:class schema:Person ;
        sh:nodeKind sh:IRI ;
        sh:in (
            <http://example.com/Employee1>
            <http://example.com/Employee2>
        ) ;
    ] ;
    hydra:apiDocumentation <https://trippin.lndo.site/api> ;
    sh:targetClass doap:Project .
}

When I open https://trippin.lndo.site/shape/project, the "Number of results per named graph" table looks like this:

Graph name | Number of results
Default graph | 8

I understand that this is the number of triples with that subject? I would also include the blank node subtree which is otherwise unreachable from other graphs, but alas.

The real question is, why is it saying "default graph"? Those triples are in a named graph.

If I remove ?union-default-graph from the endpoint, dereferencing <https://trippin.lndo.site/shape/project> returns 404. I suppose that too is expected. Is it possible to change the default resolver so that it returns all triples from the named graph whose name matches the requested URL?
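To make the request concrete, here is a minimal sketch of the two SPARQL queries I have in mind, built as plain strings in Python. This is not Trifid's actual resolver code; the function names `named_graph_query` and `count_per_graph_query` are hypothetical, and the IRI check is only a crude guard for the illustration.

```python
# Hypothetical helpers, not part of Trifid: they only build SPARQL strings.

FORBIDDEN = set('<>"{}|^`\\ ')


def _check_iri(iri):
    """Reject characters that are not allowed inside a SPARQL IRI reference."""
    if any(ch in FORBIDDEN for ch in iri):
        raise ValueError(f"unsafe IRI: {iri!r}")
    return iri


def named_graph_query(iri):
    """Return every triple stored in the named graph whose name is the requested IRI."""
    iri = _check_iri(iri)
    return f"CONSTRUCT {{ ?s ?p ?o }} WHERE {{ GRAPH <{iri}> {{ ?s ?p ?o }} }}"


def count_per_graph_query(iri):
    """Count, per named graph, the triples that have the requested IRI as subject."""
    iri = _check_iri(iri)
    return (
        f"SELECT ?g (COUNT(*) AS ?count) "
        f"WHERE {{ GRAPH ?g {{ <{iri}> ?p ?o }} }} GROUP BY ?g"
    )


if __name__ == "__main__":
    url = "https://trippin.lndo.site/shape/project"
    print(named_graph_query(url))
    print(count_per_graph_query(url))
```

The first query would return the blank node subtree as well, because it selects by graph rather than by subject.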

ludovicm67 commented 7 months ago

Right now, the file-handler that generates a SPARQL endpoint on the fly from a triple file is using Oxigraph.

Oxigraph currently has no equivalent of the CBD (Concise Bounded Description) describe strategy available in Stardog, which is why the blank nodes are missing. This should be integrated into Oxigraph in the future.
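For readers unfamiliar with CBD, here is a rough Python sketch of the idea: start from the resource, take all triples with it as subject, and keep following blank node objects. It is a simplification (it ignores the reification part of the CBD definition), it is not how Oxigraph or Stardog implement it, and the sample triples and the `concise_bounded_description` name are made up for the illustration.

```python
from collections import defaultdict


def is_blank(term):
    """Blank nodes are written with the `_:` prefix in this sketch."""
    return term.startswith("_:")


def concise_bounded_description(triples, start):
    """Collect the triples about `start`, recursing through blank node objects."""
    by_subject = defaultdict(list)
    for triple in triples:
        by_subject[triple[0]].append(triple)

    result, seen, stack = [], {start}, [start]
    while stack:
        node = stack.pop()
        for triple in by_subject.get(node, []):
            result.append(triple)
            obj = triple[2]
            if is_blank(obj) and obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return result


# Small made-up sample shaped like the SHACL data in this issue.
TRIPLES = [
    ("<shape>", "rdf:type", "sh:NodeShape"),
    ("<shape>", "sh:property", "_:b0"),
    ("_:b0", "sh:name", '"Start date"'),
    ("_:b0", "sh:in", "_:b1"),
    ("_:b1", "rdf:first", "<Employee1>"),
    ("_:b1", "rdf:rest", "rdf:nil"),
    ("<other>", "rdf:type", "sh:NodeShape"),
]

direct = [t for t in TRIPLES if t[0] == "<shape>"]
cbd = concise_bounded_description(TRIPLES, "<shape>")

if __name__ == "__main__":
    print(len(direct), len(cbd))  # 2 direct triples, 6 with the blank node subtree
```

This is the same gap as in the report: 8 triples have the shape as subject, while the full description including blank nodes has 35.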

Regarding the default graph, the current behavior is a bit strange, but it was done that way so that the triples could be loaded and queried. I asked for improvements in that part of Oxigraph some time ago, and it seems that progress has been made since. I will see if I can do something to improve this in Trifid ;)

Also, would it be possible to get a very simple and basic repro (config file, command used to spawn the instance, …) in order to save some precious time and to investigate further?

ludovicm67 commented 7 months ago

This will help: https://github.com/oxigraph/oxigraph/pull/849

ludovicm67 commented 5 months ago

The new Trifid version (v5.0.4) includes some improvements for the included Oxigraph instance.

If you want to give it a try, you can start the Docker image with the following environment variables:

environment:
  - TRIFID_CONFIG=instances/docker-fetch/config.yaml
  - FETCH_HANDLER_FILE="https://raw.githubusercontent.com/zazuko/tbbt-ld/master/dist/tbbt.nt" # default value
  - FETCH_HANDLER_FILE_TYPE="application/n-triples" # default value

ludovicm67 commented 5 months ago

I converted the TTL file into this triples file (data.nt):

<https://trippin.lndo.site/shape/project> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/shacl#NodeShape> .
<https://trippin.lndo.site/shape/project> <http://www.w3.org/ns/hydra/core#apiDocumentation> <https://trippin.lndo.site/api> .
<https://trippin.lndo.site/shape/project> <http://www.w3.org/ns/shacl#targetClass> <http://usefulinc.com/ns/doap#Project> .

<https://trippin.lndo.site/shape/project> <http://www.w3.org/ns/shacl#property> _:b0 .
<https://trippin.lndo.site/shape/project> <http://www.w3.org/ns/shacl#property> _:b1 .
<https://trippin.lndo.site/shape/project> <http://www.w3.org/ns/shacl#property> _:b2 .
<https://trippin.lndo.site/shape/project> <http://www.w3.org/ns/shacl#property> _:b3 .
<https://trippin.lndo.site/shape/project> <http://www.w3.org/ns/shacl#property> _:b4 .

_:b0 <http://www.w3.org/ns/shacl#order> "3"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:b0 <http://www.w3.org/ns/shacl#name> "Start date" .
_:b0 <http://www.w3.org/ns/shacl#path> <http://usefulinc.com/ns/doap#created> .
_:b0 <http://www.w3.org/ns/shacl#datatype> <http://www.w3.org/2001/XMLSchema#date> .

_:b1 <http://www.w3.org/ns/shacl#order> "4"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:b1 <http://www.w3.org/ns/shacl#name> "End date" .
_:b1 <http://www.w3.org/ns/shacl#path> <http://usefulinc.com/ns/doap#end> .
_:b1 <http://www.w3.org/ns/shacl#datatype> <http://www.w3.org/2001/XMLSchema#date> .

_:b2 <http://www.w3.org/ns/shacl#order> "2"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:b2 <http://www.w3.org/ns/shacl#name> "Project description" .
_:b2 <http://www.w3.org/ns/shacl#path> <http://usefulinc.com/ns/doap#description> .
_:b2 <http://www.w3.org/ns/shacl#datatype> <http://www.w3.org/2001/XMLSchema#string> .

_:b3 <http://www.w3.org/ns/shacl#order> "1"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:b3 <http://www.w3.org/ns/shacl#name> "Project name" .
_:b3 <http://www.w3.org/ns/shacl#path> <http://usefulinc.com/ns/doap#name> .
_:b3 <http://www.w3.org/ns/shacl#datatype> <http://www.w3.org/2001/XMLSchema#string> .
_:b3 <http://www.w3.org/ns/shacl#minCount> "1"^^<http://www.w3.org/2001/XMLSchema#integer> .

_:b4 <http://www.w3.org/ns/shacl#order> "5"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:b4 <http://www.w3.org/ns/shacl#name> "Responsible person" .
_:b4 <http://www.w3.org/ns/shacl#path> <http://schema.org/accountablePerson> .
_:b4 <http://www.w3.org/ns/shacl#class> <http://schema.org/Person> .
_:b4 <http://www.w3.org/ns/shacl#nodeKind> <http://www.w3.org/ns/shacl#IRI> .
_:b4 <http://www.w3.org/ns/shacl#in> _:b5 .

_:b5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> <http://example.com/Employee1> .
_:b5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:b6 .

_:b6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> <http://example.com/Employee2> .
_:b6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .

And I created a Docker Compose stack:

services:
  trifid:
    image: ghcr.io/zazuko/trifid:v5.0.4
    ports:
      - "8080:8080"
    volumes:
      - ./data.nt:/data/data.nt:ro
    environment:
      - TRIFID_CONFIG=instances/docker-fetch/config.yaml
      - FETCH_HANDLER_FILE=/data/data.nt
      - FETCH_HANDLER_FILE_TYPE=application/n-triples
      - DATASET_BASE_URL=https://trippin.lndo.site/

Start it using:

docker compose up

And when I go to http://0.0.0.0:8080/shape/project, I get this:

[image: screenshot of the rendered entity page]

So the number of triples shown seems to be fixed.

The current configuration of the fetch handler plugin does not support quads; everything is done on the default graph.

ludovicm67 commented 3 months ago

I spent some more time investigating this. The issue seems to be related to https://github.com/oxigraph/oxigraph/issues/960

The handler-fetch plugin runs a DESCRIBE query and expects an N-Quads response. For now, Oxigraph returns every quad in the default graph instead of under its real graph name.
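To illustrate why the table ends up showing only "Default graph", here is a small Python sketch of a per-graph tally. It is not the actual Trifid rendering code; the `results_per_graph` function, the tuple representation of quads, and the use of `None` for a missing graph name are assumptions made for the example.

```python
from collections import Counter

DEFAULT_GRAPH_LABEL = "Default graph"


def results_per_graph(quads, subject):
    """Count quads about `subject`, grouped by graph name (None = default graph)."""
    counts = Counter()
    for s, _p, _o, graph in quads:
        if s == subject:
            counts[graph if graph is not None else DEFAULT_GRAPH_LABEL] += 1
    return dict(counts)


SHAPE = "https://trippin.lndo.site/shape/project"

# Quads as they are stored: the graph name equals the resource IRI.
stored = [
    (SHAPE, "rdf:type", "sh:NodeShape", SHAPE),
    (SHAPE, "sh:targetClass", "doap:Project", SHAPE),
]

# Quads as they come back when the graph name is dropped from the response.
returned = [(s, p, o, None) for s, p, o, _g in stored]

if __name__ == "__main__":
    print(results_per_graph(stored, SHAPE))
    print(results_per_graph(returned, SHAPE))
```

With the stored quads the tally is keyed by the real graph name; with the graph term missing, every result falls under "Default graph", which matches what the entity page shows.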