IPIF / prosopogrAPhI

Tentative way towards a shared API for prosopographical data based on the factoid model (Bradley/Short 2005)

Statement params wildcard #26

Open richardhadden opened 3 years ago

richardhadden commented 3 years ago

Given the wide range of vocabularies in use, it would be useful to select statements based on a non-empty value for specific fields, e.g. using the presence of any value for the ?name parameter to select "naming" statements.

The in-development Python client (https://gitlab.com/acdh-oeaw/ipif-client-python) works around this by fetching each statement with a separate request, but this is clearly inefficient.

I propose using a 'wildcard' value to match any non-empty value, e.g. /statements/?name=*. (An alternative is to introduce a mustNotBeEmpty parameter that takes a list of required parameters, though this is clearly more complicated.)
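As a rough illustration of what the proposed request could look like from a client, here is a minimal sketch. The `/statements/` path and the `name` parameter come from the proposal above; the helper name `statements_url` and the base URL are hypothetical, and this is not how the in-development Python client is implemented.

```python
from urllib.parse import urlencode


def statements_url(base_url: str, **params: str) -> str:
    """Build a /statements/ request URL (hypothetical helper).

    A value of "*" is the proposed wildcard meaning "any non-empty value".
    """
    # "*" is a permitted sub-delimiter in a URI query (RFC 3986), so it is
    # kept literal here for readability; without safe="*" it would be sent
    # as %2A, which a server decodes to the same value.
    query = urlencode(params, safe="*")
    return f"{base_url.rstrip('/')}/statements/?{query}"


# Assumed base URL, for illustration only.
print(statements_url("https://example.org/ipif", name="*"))
```

A single request of this shape would replace the per-statement requests described above, provided the endpoint understands the wildcard.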

If the intention (as yet not fully described in the IPIF spec) for full-text matching is to adhere to Lucene syntax (https://lucene.apache.org/core/2_9_4/queryparsersyntax.html), an asterisk is not the best character: per that syntax, a search term cannot begin with * or ?.

The canonical way to do a non-empty search in Lucene is fieldName:[* TO *], but [* TO *] seems arcane as a URI parameter. Are there any single, suitable characters that will not break a URL or Lucene?

(One possibility is to 'repurpose' the single asterisk, which is not legitimate Lucene syntax anyway, and have each endpoint translate it into whatever query is required for non-empty values.)
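To make the 'repurposing' idea concrete, here is a minimal sketch of how an endpoint backed by Lucene might translate the parameters. Everything here is an assumption for illustration: the function names are hypothetical, and treating every non-wildcard value as a quoted phrase is only one possible reading, since the IPIF spec does not yet define full-text matching.

```python
WILDCARD = "*"  # assumed sentinel: "this field must have some value"


def to_lucene_clause(field: str, value: str) -> str:
    """Translate one URL parameter into a Lucene query clause (sketch)."""
    if value == WILDCARD:
        # The Lucene range form for "field is non-empty", as noted above.
        return f"{field}:[* TO *]"
    # Assumption: any other value is matched as a quoted phrase, so
    # backslashes and double quotes inside it are escaped.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def build_query(params: dict[str, str]) -> str:
    """Combine all parameters with AND (an assumed default)."""
    return " AND ".join(to_lucene_clause(f, v) for f, v in params.items())


print(build_query({"name": "*", "role": "teacher"}))
```

An endpoint with a different backend (SQL, SPARQL, etc.) would replace only the wildcard branch with its own non-empty test, so the URL convention stays the same for clients.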

GVogeler commented 3 years ago

closed with #27