reconciliation-api / specs

Specifications of the reconciliation API
https://reconciliation-api.github.io/specs/draft/
31 stars 9 forks source link

Move 'query' field inside 'properties' #149

Closed wetneb closed 6 months ago

wetneb commented 9 months ago

Closes #134. Closes #106.

wetneb commented 8 months ago

@tfmorris in our call this week, we thought that we could accommodate your preference for an explicit pid, because we have an eye on https://github.com/reconciliation-api/specs/issues/145 as well, for which we would likely need a similar solution (and we would not be able to use an empty pid for both, obviously).

Also, I want to stress that the consensus we reach in the monthly calls are not (in my opinion) meant to be definitive resolutions of the entire community group, but rather ways to quickly run ideas by other group members and get their feedback early on (saving some rounds of review on GitHub). So there is no problem with questioning those outcomes. That's why we still make PRs for those things.

osma commented 8 months ago

I think it was me who suggested an empty pid, in the spirit of "what is the simplest thing that could possibly work?". I have no problem admitting that this is hackish and ugly.

we thought that we could accommodate your preference for an explicit pid, because we have an eye on https://github.com/reconciliation-api/specs/issues/145 as well, for which we would likely need a similar solution (and we would not be able to use an empty pid for both, obviously).

It seems to me that the scope of properties is expanding, perhaps way beyond its original intent, if we are going to apply it not just to labels but to types as well. How about renaming it into something more generic such as conditions (or where) and adding a new attribute match_type to specify whether we are matching labels, properties or types? It could look something like this:

      "conditions": [
        {
          "match_type": "label",
          "v": "Christel Hanewinckel"
        },
        {
          "match_type": "property",
          "pid": "professionOrOccupation",
          "v": "Politik*",
          "required": false,
          "match_quantifier": "any",
          "match_qualifier": "WildcardMatch"
        },
        {
          "match_type": "property",
          "pid": "affiliation",
          "v": "http://d-nb.info/gnd/2022139-3",
          "required": false,
          "match_quantifier": "any",
          "match_qualifier": "ExactMatch"
        }
      ]

To handle types as well (#145), we could add "match_type": "type" as well.

fsteeg commented 7 months ago

To handle types as well (https://github.com/reconciliation-api/specs/issues/145), we could add "match_type": "type" as well.

So to summarize, we'd have 3 different kinds of match_type for these proposed conditions: label (replacing the old query, no pid required), type (replacing the old type, no pid required), and property (the original case, pid required).

For compatibility, we could also keep the properties name, define "match_type": "property" as the default, and effectively only add a new optional field match_type? And for the possible values we could stick with the existing query, type, property as values.

It seems like a sensible way to structure this. In general, simply using label and type as pid and not adding a new field seems simpler. However, with a match_type we'd avoid possible collisions with existing label or type fields (and workarounds like _label / _type), right? And maybe it's actually cleaner to have a separate field, instead of giving certain pid values a special meaning?

wetneb commented 7 months ago

That sounds good to me, but I would suggest "match_type": "name" instead of "match_type": "label" since that's currently how we call this field in the specs (and JSON serialization of reconciliation candidates, for instance).

osma commented 7 months ago

"match_type": "name" seconded!

osma commented 6 months ago

Should the changes made in this PR also be mentioned in the changes section 1.4.3 This Draft?

wetneb commented 6 months ago

Yes, we could add update this section with all of the recent changes we have done (not just this one).