fluree / core

Fluree releases and public bug reports

`insert + where` pattern should allow net-new insertions #21

Closed aaj3f closed 8 months ago

aaj3f commented 10 months ago

Description

[UPDATED 10/17 to reflect newest decisions]

Transaction syntax should be exclusively managed via a combination of insert + delete + where clauses.

insert takes JSON-LD-shaped data and exclusively asserts the facts described within that JSON-LD payload

delete takes JSON-LD-shaped data and exclusively deletes the facts described within that JSON-LD payload

where binds the current db state against triple-pattern statements, so that data can be inserted/deleted programmatically without having to specify the values explicitly. where is not required with insert and/or delete clauses, but it allows for much more powerful transactions, because a single ?binding in a where statement can be bound to many values at once.
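As a rough illustration of how a where clause could yield several bindings for one variable, here is a minimal Python sketch. It is not Fluree's implementation: the db is modeled as a plain set of (subject, property, value) tuples, and the helper names is_var, unify, and solve_where are hypothetical.

```python
def is_var(term):
    """Assumption: a variable is any string that starts with '?'."""
    return isinstance(term, str) and term.startswith("?")


def unify(pattern, triple, binding):
    """Extend `binding` so `pattern` matches `triple`; return None on a mismatch."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if is_var(term):
            # Bind the variable, or check it against an earlier binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def solve_where(db, patterns):
    """Return every binding (variable -> value) that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in sorted(db):  # sorted only to make the order stable
                extended = unify(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Toy db state (hypothetical data in the spirit of the examples here).
db = {
    ("ex:fluree", "ex:employees", "ex:andrew"),
    ("ex:fluree", "ex:employees", "ex:dan"),
    ("ex:andrew", "ex:name", "Andrew"),
}

bindings = solve_where(db, [["ex:fluree", "ex:employees", "?flureeEmployee"]])
for b in bindings:
    print(b)
```

In this sketch each solution is one dict, so a single ?flureeEmployee variable ends up with one solution per matching fact.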

For example, this first delete txn would delete the value of ex:name on ex:andrew only if the db currently contains exactly the fact [ex:andrew, ex:name, "Andrew"]:

{
   "delete": {
      "@id": "ex:andrew",
      "ex:name": "Andrew"
   }
}

Whereas the following use of where + delete would delete any value from ex:name that exists on ex:andrew:

{
   "where": [["ex:andrew", "ex:name", "?name"]],
   "delete": {
      "@id": "ex:andrew",
      "ex:name": "?name"
   }
}
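The difference between the two delete forms can be sketched in Python over a toy set of triples. This is only an illustration under the assumption that the db behaves like a set of (subject, property, value) facts; delete_exact and delete_where are hypothetical names, not Fluree functions.

```python
# Toy db in which ex:andrew has two values on ex:name (hypothetical data).
db = {
    ("ex:andrew", "ex:name", "Andrew"),
    ("ex:andrew", "ex:name", "Andy"),
}


def delete_exact(db, triple):
    """Plain delete: retract one fully specified fact; a no-op if it is absent."""
    return db - {triple}


def delete_where(db, subject, prop):
    """where + delete with a variable value: retract every value of `prop` on `subject`."""
    return {t for t in db if not (t[0] == subject and t[1] == prop)}


after_exact = delete_exact(db, ("ex:andrew", "ex:name", "Andrew"))
after_where = delete_where(db, "ex:andrew", "ex:name")

print(after_exact)  # only the "Andrew" fact is gone
print(after_where)  # no ex:name facts remain on ex:andrew
```

The plain delete leaves the "Andy" value in place, while the where-driven delete removes both values.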

Similarly, the following insert txn will only add the value ex:fluree to ex:employedBy on the two subjects explicitly listed:

{
   "insert": [
      { "@id": "ex:andrew", "ex:employedBy": { "@id": "ex:fluree" } },
      { "@id": "ex:dan", "ex:employedBy": { "@id": "ex:fluree" } }
   ]
}

Whereas the following use of where + insert would expand to add "ex:employedBy": { "@id": "ex:fluree" } on every entity bound to ?flureeEmployee:

{
   "where": [["ex:fluree", "ex:employees", "?flureeEmployee"]],
   "insert": {
      "@id": "?flureeEmployee",
      "ex:employedBy": { "@id": "ex:fluree" }
   }
}
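One way to picture the expansion is as template substitution: the insert body is filled in once per where solution. The Python sketch below assumes the solutions are already available as a list of dicts; substitute and expand are hypothetical helper names, and this is not a description of how Fluree actually expands templates.

```python
def substitute(node, binding):
    """Recursively replace ?variables in a JSON-LD-shaped template (keys and values)."""
    if isinstance(node, dict):
        return {substitute(k, binding): substitute(v, binding) for k, v in node.items()}
    if isinstance(node, list):
        return [substitute(item, binding) for item in node]
    if isinstance(node, str) and node.startswith("?"):
        return binding[node]  # raises KeyError if the variable was never bound
    return node


def expand(template, bindings):
    """Instantiate the template once for each where solution."""
    return [substitute(template, b) for b in bindings]


template = {"@id": "?flureeEmployee", "ex:employedBy": {"@id": "ex:fluree"}}

# Assumed where solutions for [["ex:fluree", "ex:employees", "?flureeEmployee"]].
bindings = [{"?flureeEmployee": "ex:andrew"}, {"?flureeEmployee": "ex:dan"}]

expanded = expand(template, bindings)
for node in expanded:
    print(node)
```

With these two assumed solutions, the expanded result has the same shape as the explicit two-subject insert shown earlier.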

"Updates" of data, where you don't want a plain assert/insert or delete but want to take existing entities and replace their values if values exist, would use a combination of insert + delete + where. For example...

Consider a few scenarios in which I want to say that ex:andrew has a schema:name of "Andrew":

  1. I want to insert this fact regardless of whether ex:andrew has a value on schema:name
  2. I want to insert this fact, but only if ex:andrew has the value "Baby Andrew, Mother's Precious Child" on schema:name
  3. I want to insert this fact, but I want it to replace whatever value ex:andrew has on schema:name, regardless of value

I want to insert this fact regardless of whether ex:andrew has a value on schema:name

{
   "insert": {
      "@id": "ex:andrew",
      "schema:name": "Andrew"
   }
}

I want to insert this fact, but only if ex:andrew has the value "Baby Andrew, Mother's Precious Child" on schema:name

{
   "where": [
      ["?s", "@id", "ex:andrew"],
      ["?s", "schema:name", "Baby Andrew, Mother's Precious Child"]
   ],
   "delete": {
      "@id": "?s",
      "schema:name": "Baby Andrew, Mother's Precious Child"
   },
   "insert": {
      "@id": "?s",
      "schema:name": "Andrew"
   }
}

I want to insert this fact, but I want it to replace whatever value ex:andrew has on schema:name, regardless of value

{
   "where": [
      ["ex:andrew", "schema:name", "?o"]
   ],
   "delete": {
      "@id": "ex:andrew",
      "schema:name": "?o"
   },
   "insert": {
      "@id": "ex:andrew",
      "schema:name": "Andrew"
   }
}
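The three scenarios can be simulated with a small Python sketch. It rests on several assumptions that the issue does not spell out: the db is a set of (subject, property, value) tuples, templates are flat nodes, a pattern of the form [?s, "@id", iri] simply binds ?s to that IRI, and (as in SPARQL Update) a where clause with zero solutions makes the whole transaction a no-op. All function names are hypothetical.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def unify(pattern, triple, binding):
    """Extend `binding` so `pattern` matches `triple`; None on a mismatch."""
    out = dict(binding)
    for term, value in zip(pattern, triple):
        if is_var(term):
            if out.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return out


def solve(db, patterns):
    """Return all bindings that satisfy every [s, p, o] pattern."""
    solutions = [{}]
    for pattern in patterns:
        if pattern[1] == "@id":
            # Assumption: [?s, "@id", iri] binds ?s to a constant IRI.
            candidates = [(pattern[2], "@id", pattern[2])]
        else:
            candidates = sorted(db)
        solutions = [
            out
            for binding in solutions
            for triple in candidates
            for out in [unify(pattern, triple, binding)]
            if out is not None
        ]
    return solutions


def instantiate(template, binding):
    """Turn a flat {"@id": ..., prop: value} template into concrete triples."""
    def resolve(term):
        return binding.get(term, term)
    subject = resolve(template["@id"])
    return {(subject, resolve(p), resolve(o)) for p, o in template.items() if p != "@id"}


def transact(db, txn):
    """Apply delete, then insert, once per where solution (once if there is no where)."""
    solutions = solve(db, txn["where"]) if "where" in txn else [{}]
    retract, asserted = set(), set()
    for binding in solutions:
        if "delete" in txn:
            retract |= instantiate(txn["delete"], binding)
        if "insert" in txn:
            asserted |= instantiate(txn["insert"], binding)
    return (db - retract) | asserted


BABY = "Baby Andrew, Mother's Precious Child"
new_name = {"@id": "ex:andrew", "schema:name": "Andrew"}

unconditional = {"insert": new_name}
conditional = {
    "where": [["?s", "@id", "ex:andrew"], ["?s", "schema:name", BABY]],
    "delete": {"@id": "?s", "schema:name": BABY},
    "insert": {"@id": "?s", "schema:name": "Andrew"},
}
replace = {
    "where": [["ex:andrew", "schema:name", "?o"]],
    "delete": {"@id": "ex:andrew", "schema:name": "?o"},
    "insert": new_name,
}

s1 = transact({("ex:andrew", "schema:name", BABY)}, unconditional)
s2 = transact({("ex:andrew", "schema:name", BABY)}, conditional)
s3 = transact({("ex:andrew", "schema:name", "Drew"),
               ("ex:andrew", "schema:name", "Andy")}, replace)
```

Under these assumptions, the replace transaction does nothing when ex:andrew has no schema:name at all, because the where clause finds no solutions. Whether that is the desired behavior is a design question this sketch does not settle.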

Lastly, we want to guarantee that we can bind not only object values (and reuse them in insert/delete clauses) but property IRIs as well. For example, we may want to delete ALL facts on ex:andrew before asserting several new facts, without knowing in advance which facts exist on which properties for ex:andrew. That might look like:

{
   "where": [["ex:andrew", "?p", "?o"]],
   "delete": {
      "@id": "ex:andrew",
      "?p": "?o"
   },
   "insert": {
      "@id": "ex:andrew",
      "schema:name": "Andrew"
   }
}
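To make the effect of binding ?p concrete, here is a short Python sketch of what that transaction would compute over a toy set of triples. The data is hypothetical, and the set-based model is an assumption rather than Fluree's internals.

```python
# Toy db: two facts on ex:andrew, one on another subject (hypothetical data).
db = {
    ("ex:andrew", "schema:name", "Drew"),
    ("ex:andrew", "ex:follows", "ex:dan"),
    ("ex:dan", "schema:name", "Dan"),
}

# where [["ex:andrew", "?p", "?o"]]: one (?p, ?o) solution per fact on ex:andrew.
bindings = [{"?p": p, "?o": o} for (s, p, o) in sorted(db) if s == "ex:andrew"]

# delete {"@id": "ex:andrew", "?p": "?o"}: one retraction per solution.
retract = {("ex:andrew", b["?p"], b["?o"]) for b in bindings}

# insert {"@id": "ex:andrew", "schema:name": "Andrew"}: contains no variables.
asserted = {("ex:andrew", "schema:name", "Andrew")}

new_db = (db - retract) | asserted
print(sorted(new_db))
```

Facts on other subjects, such as ex:dan, are untouched because the where pattern fixes the subject to ex:andrew.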

~insert + where is the ideal pattern for inserting new values on properties without retracting previous values. For example if ex:andrew already had two object referent values on ex:follows (e.g. Andrew follows two other people currently), and we want to add a new value to ex:andrew on ex:follows, if we use the following syntax it will retract the current two facts on [ex:andrew, ex:follows, ...] and replace with the new value(s)~

~{ "@id": "ex:andrew", "ex:follows": [{ "@id": "ex:marcela", "schema:name": "Marcela" }] }~

~That's fine. The pattern insert + where allows for facts to be added without retracting existing facts.~

~However, the current implementation of insert + where only allows the use of node IRIs for nodes already existing in the db. So I could do this, but only if a node for ex:marcela already existed:~

~The desired work here would be to allow the following:~

~The outcome here would be to add s-p-o triples for ex:marcela while also adding the [ex:andrew, ex:follows, ex:marcela] triple to create an edge between those two nodes.~

~Note: This also means we should be able to use insert without where or delete and also insert + where without delete (or just where and delete without insert)~

Note: for additional context, see this internal Slack conversation

aaj3f commented 10 months ago

In icebox pending further discussion about the allowance of JSON-LD-like elements within insert & where statements