reconciliation-api / specs

Specifications of the reconciliation API
https://reconciliation-api.github.io/specs/draft/

Allow more flexible filter constraint handling of property arrays #88

Closed thadguidry closed 7 months ago

thadguidry commented 2 years ago

To fully accommodate such features and give clients more accurate scores against the set of optional or required property filter constraints that a user requests, we should allow multiple property arrays and, within each array, the combining behaviors that were typical of past recon services such as Freebase (see the old Freebase advanced filtering and search cookbook).

This would also avoid the need for Union types, since we could construct something like (or P104:"box" P104:"ball")

Individual properties could be filtered in a similar way with suffixes, as in the Prisma backend for GraphQL, such as:

_gt (greater than)
_lt (less than)
_gte (greater than or equal to)
_lte (less than or equal to)
_in (equal to one of a list of values)
_not_in (not equal to any value in a list)

Example in GraphQL, i.e. return the details of all persons whose age is greater than 18:

query {
  persons(where: {age_gt: 18}) {
    firstName
    lastName
    age
  }
}
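
To make the suffix idea concrete, here is a minimal Python sketch of how a service might interpret such keys. The names `SUFFIX_OPS`, `split_constraint`, and `matches` are hypothetical, and nothing here is part of the reconciliation spec or of Prisma itself. It only illustrates the naming convention, assuming that a key without a suffix means plain equality.

```python
import operator

# Hypothetical mapping from the suffixes listed above to comparison functions.
SUFFIX_OPS = {
    "_gt": operator.gt,
    "_lt": operator.lt,
    "_gte": operator.ge,
    "_lte": operator.le,
    "_in": lambda value, options: value in options,
    "_not_in": lambda value, options: value not in options,
}


def split_constraint(key):
    """Split a key such as 'age_gt' into ('age', '_gt').

    A key without a known suffix is returned as (key, None), which this
    sketch treats as an equality constraint.
    """
    # Try longer suffixes first so that '_not_in' is not mistaken for '_in'.
    for suffix in sorted(SUFFIX_OPS, key=len, reverse=True):
        if key.endswith(suffix) and len(key) > len(suffix):
            return key[: -len(suffix)], suffix
    return key, None


def matches(record, where):
    """Return True if the record satisfies every constraint in `where`."""
    for key, expected in where.items():
        field, suffix = split_constraint(key)
        if field not in record:
            return False
        compare = SUFFIX_OPS[suffix] if suffix else operator.eq
        if not compare(record[field], expected):
            return False
    return True


# Made-up sample data mirroring the GraphQL query above.
persons = [
    {"firstName": "Ann", "lastName": "Lee", "age": 17},
    {"firstName": "Bob", "lastName": "Ray", "age": 42},
]
adults = [p for p in persons if matches(p, {"age_gt": 18})]
```

One known weakness of this convention is that a property whose own name ends in a suffix (for example a field literally called `check_in`) would be misread, so a real design would likely need an explicit operator field instead.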

In the OpenRefine Wikibase reconciliation service issue https://github.com/wetneb/openrefine-wikibase/issues/141#issuecomment-1170311259 I commented that it currently seems impossible to use the reconciliation API standard to form a filtering query such as this example:

?query="William Albert Ablett"&filter=(any P735:"Albert" P735:"John" (and P734:"Ablett"))
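
As a rough illustration of what handling such a filter would involve, the following Python sketch parses the parenthesized expression into a tree and evaluates it against an entity. Everything here is an assumption made for illustration: the `parse` and `evaluate` helpers, the entity layout (a dict from property ID to a list of values), and the literal reading of the operators, where `any`/`or` succeed if at least one child matches and `and` requires all children to match. None of this is defined by the current spec.

```python
import re

# Tokens: parentheses, PID:"value" terms, and bare operator words.
TOKEN = re.compile(r'\(|\)|\w+:"[^"]*"|\w+')


def parse(text):
    """Parse '(op term ... (op term ...))' into nested (head, rest) tuples.

    Operator nodes look like ('any', [children]); leaf terms look like
    ('P735', 'Albert').
    """
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            op = tokens[pos]
            pos += 1
            children = []
            while tokens[pos] != ")":
                children.append(read())
            pos += 1  # consume the closing ")"
            return (op, children)
        pid, value = token.split(":", 1)
        return (pid, value.strip('"'))

    return read()


def evaluate(node, entity):
    """Evaluate a parsed filter against an entity (dict of pid -> list of values)."""
    head, rest = node
    if isinstance(rest, str):  # leaf term such as ('P735', 'Albert')
        return rest in entity.get(head, [])
    results = [evaluate(child, entity) for child in rest]
    if head in ("any", "or"):
        return any(results)
    if head == "and":
        return all(results)
    raise ValueError(f"unknown operator: {head}")


tree = parse('(any P735:"Albert" P735:"John" (and P734:"Ablett"))')
entity = {"P735": ["Albert"], "P734": ["Ablett"]}  # made-up sample entity
is_match = evaluate(tree, entity)
```

Under this literal reading, the outer `any` is satisfied by any one of its children, so a proposal would still need to pin down how nested groups combine.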

QUESTION 1.

In the current spec, we do not mention the cardinality of properties, but we seem to imply that each query object can have only one properties array?

With the current spec, there is no way to express such a filter in a single query while also using properties[] and type_strict, since supplying multiple properties[] arrays breaks the implicit relationship with type_strict (which type_strict attaches to which properties[] array?). We could certainly move type_strict inside properties[], but this is a major change to the API, and similar questions have been raised in issue #60.

QUESTION 2.

It then seems we would need at least 2 queries, which defeats the purpose of the narrower filtering that could be accomplished in a single request?

match entities named "William Albert Ablett" that have either "Albert" or "John" as a first name property and "Ablett" as a last name property

{
  "q0": {
    "query": "William Albert Ablett",
    "type": "Q5",
    "limit": 5,
    "properties": [
      {
        "pid": "firstName",
        "v": ["Albert","John"]
      }
    ],
    "type_strict": "any",
    "properties": [
      {
        "pid": "lastName",
        "v": "Ablett"
      }
    ],
    "type_strict": "and"
  }
}
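
A practical consequence of repeating "properties" and "type_strict" in one object is that many JSON parsers silently keep only one of the duplicates. The Python sketch below shows this with the standard `json` module on a condensed version of the payload above, and then uses `object_pairs_hook` to detect the duplicates. The `reject_duplicates` helper is hypothetical, shown only to illustrate how a service could refuse such input.

```python
import json

# Condensed version of the payload above, with the duplicate keys kept.
payload = """
{
  "q0": {
    "query": "William Albert Ablett",
    "properties": [{"pid": "firstName", "v": ["Albert", "John"]}],
    "type_strict": "any",
    "properties": [{"pid": "lastName", "v": "Ablett"}],
    "type_strict": "and"
  }
}
"""

# By default, Python's json module keeps only the last value for a repeated
# key, so the first "properties" array and "type_strict" are lost.
parsed = json.loads(payload)
surviving_pids = [p["pid"] for p in parsed["q0"]["properties"]]
surviving_type_strict = parsed["q0"]["type_strict"]


def reject_duplicates(pairs):
    """Hypothetical hook: raise if a JSON object repeats a key."""
    keys = [key for key, _ in pairs]
    duplicates = sorted({key for key in keys if keys.count(key) > 1})
    if duplicates:
        raise ValueError(f"duplicate keys: {duplicates}")
    return dict(pairs)


try:
    json.loads(payload, object_pairs_hook=reject_duplicates)
    message = None
except ValueError as err:
    message = str(err)
```

This is why a single query object cannot carry two independent properties arrays in practice, regardless of what the spec says about cardinality.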

QUESTION 3.

If some developers want it, how could we make it easier to allow some form of Union types, such as those in GraphQL (https://graphql.org/learn/schema/#union-types), and what might that best look like?

wetneb commented 2 years ago

I am not sure what you mean by this filtering. As far as I know, this is not supported by the specs. Maybe some services support it (which ones?), but I think it has not been specified here yet.