fluree / db

Fluree database library
https://fluree.github.io/db/

SHACL cannot constrain @id value #591

Open dpetran opened 9 months ago

dpetran commented 9 months ago

Creating a SHACL shape that constrains the value of an @id creates a new @id flake instead of reusing the existing one.

(let [conn        @(fluree/connect {:method :memory})
      ledger-name "foo"
      ledger      @(fluree/create conn ledger-name {:defaultContext [test-utils/default-str-context {"ex" "http://example.com/"}]})
      db0         (fluree/db ledger)

      ;; shape: every ex:Person must have an @id whose node kind is sh:IRI
      db1         @(fluree/stage db0 {"type" "sh:NodeShape"
                                      "sh:targetClass" {"id" "ex:Person"}
                                      "sh:property" {"sh:path" {"id" "id"}
                                                     "sh:nodeKind" {"id" "sh:IRI"}}})]

  ;; valid
  @(fluree/stage db1 {"@id" "ex:Bob" "type" "ex:Person"})
  ;; invalid (no explicit @id), yet no error is raised
  @(fluree/stage db1 {"type" "ex:Person"}))

Instead of a proper shape, we end up with an internal representation that looks like this:

{:id "_:f211106232532993", :path [[1002 :predicate]], :node-kind 224}

Where the :path sid is 1002 (or some new sid) instead of 0.

Looking in the db novelty, this is the erroneous flake: #Flake [1002 0 "@id" 1 -1 true nil]. Instead of creating a new flake, it should fall back to the base vocab identity flake #Flake [0 0 "@id" 1 0 true nil].

dpetran commented 9 months ago

It's looking like the index-range scan isn't picking up any matches for "@id":

;; brand new db
(-> db :novelty :post)
;;=>
#{#Flake [0 0 "@id" 1 -1 true nil]
  #Flake [200 0 "@type" 1 -1 true nil]
  #Flake [203 0 "http://www.w3.org/2000/01/rdf-schema#Class" 1 -1 true nil]}

;; but index-range can't find it
(async/<!! (query-range/index-range db :post = [const/$xsd:anyURI "@id"]))
;;=>
[]

bplatz commented 9 months ago

Should this be allowed? What sort of validation would be wanted here? I am not sure this SHACL would be compatible with systems outside Fluree, as @id doesn't exist outside of JSON-LD. I can't recall seeing a way to target a subject IRI in SHACL itself, but I think the bigger question is what this would be used for, since we could choose to support it regardless.

dpetran commented 9 months ago

This question popped up on Discord where a user wanted to constrain @id to only be IRIs, not blank node ids, using the sh:nodeKind constraint. You're right that this construction may not work outside of Fluree - it relies on the existence of an "id flake": [<sid> 0 <sid> 1 <t> true nil].

In any case, I do think there is an underlying bug here: our index-range call isn't picking up the id flake correctly.

So we should make a decision about whether to support the use case, but I do think we should investigate the index-range behavior.
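
For reference, the distinction the Discord user was after (IRIs versus blank node ids) can be sketched outside of Fluree. This is a minimal illustration, not Fluree's validation code: the helper names blank-node-id? and iri-node? are hypothetical, and the only fact relied on is that JSON-LD blank node identifiers begin with "_:".

```clojure
(require '[clojure.string :as str])

;; Hypothetical helpers for illustration; not part of Fluree's API.

(defn blank-node-id?
  "True when id is a string using the JSON-LD blank node prefix \"_:\"."
  [id]
  (and (string? id) (str/starts-with? id "_:")))

(defn iri-node?
  "True when the node map has an explicit @id that is not a blank node id."
  [node]
  (let [id (get node "@id")]
    (and (string? id) (not (blank-node-id? id)))))

(iri-node? {"@id" "ex:Bob" "type" "ex:Person"}) ;; => true
(iri-node? {"@id" "_:b0" "type" "ex:Person"})   ;; => false
(iri-node? {"type" "ex:Person"})                ;; => false (no explicit @id)
```

The last case corresponds to the "invalid - no error" transaction in the original report: a node without an explicit @id would be assigned a blank node id and so should fail an sh:IRI node-kind check.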

JaceRockman commented 3 months ago

@dpetran I tried to check if this is still an issue using this code:

(let [conn        @(fluree/connect {:method :memory})
      ledger-name "foo"
      ledger      @(fluree/create conn ledger-name)
      db0         (fluree/db ledger)
      db1         @(fluree/stage db0 {"@context" context
                                      "insert" [{"@type" "sh:NodeShape"
                                                 "sh:targetClass" {"@id" "ex:Person"}
                                                 "sh:property" {"sh:path" {"@id" "id"}
                                                                "sh:nodeKind" {"@id" "sh:IRI"}}}]})]
  @(fluree/stage db1 {"context" context
                      "insert" {"@id" "ex:Bob" "type" "ex:Person"}})
  @(fluree/stage db1 {"@context" context
                      "insert" {"type" "ex:Person"}})
  @(fluree/query db1 {"@context" {"schema" "http://schema.org/"}
                      :where     '{"@id"         ?s
                                   "sh:path" "?path"}
                      :select    '{?s ["*"]}}))

which resulted in: [{@id _:fdb-1710279741715-4ceM2b6R, sh:nodeKind {@id sh:IRI}, sh:path {@id id}}]

Is this the result we would expect, and is the code I wrote to test it correct?

dpetran commented 3 months ago

For a practical test, I'd change the shape to check for an actual blank node:

{"@type" "sh:NodeShape"
 "sh:targetClass" {"@id" "ex:Person"}
 "sh:not" {"sh:path" {"@id" "id"}
           "sh:nodeKind" {"@id" "sh:BlankNode"}}}

I'm still not sure this can even work, especially now that we don't have id flakes.

JaceRockman commented 3 months ago

I get the same result when I use that NodeShape. Is that to be expected?

dpetran commented 3 months ago

The expected result is that the stage without an explicit @id would fail with a validation error, and it looks like it didn't. On the other hand, I don't think you can use @id as a path, so I'm not sure what the true expectation of this is.
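
One alternative worth noting, as a sketch rather than a confirmed fix: in standard SHACL, sh:nodeKind can be declared directly on a node shape, in which case it constrains the focus node itself and no sh:path is needed. The shape below is illustrative only (the name person-iri-shape is made up here), and whether Fluree enforces a node-level sh:nodeKind this way has not been verified in this thread.

```clojure
;; Sketch only: sh:nodeKind sits on the NodeShape itself, so it applies to
;; each ex:Person focus node rather than to the values of a property path.
;; Assumption: Fluree support for this form is untested here.
(def person-iri-shape
  {"@type"          "sh:NodeShape"
   "sh:targetClass" {"@id" "ex:Person"}
   "sh:nodeKind"    {"@id" "sh:IRI"}})
```

This avoids using @id as a path altogether, which sidesteps the question of whether an id flake exists to match against.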

JaceRockman commented 3 months ago

@dpetran Should I go ahead and close this issue then?