Conal-Tuohy / PROV-Solr-API-Tools

Provides additional tools for working with PROV's Solr API
Apache License 2.0

date range #15

Closed asaletourneau closed 2 years ago

asaletourneau commented 2 years ago

An API query for dawn fraser for the date range 30/11/1956 - 01/12/1956 doesn't return a result. Query URL: http://prov.conaltuohy.com/search/query?wt=xml&q=(text%3A(dawn%20fraser))%20AND%20(start_dt%3A(1956-11-30))%20AND%20(end_dt%3A(1956-12-01))

However, the API query for dawn fraser for the date range 30/11/1956 - 30/11/1956 does return a result, i.e. http://prov.conaltuohy.com/search/query?wt=xml&q=(text%3A(dawn%20fraser))%20AND%20(start_dt%3A(1956-11-30))%20AND%20(end_dt%3A(1956-11-30))

The actual record (photo) on the PROV website is dated 30/11/1956, i.e. https://prov.vic.gov.au/archive/0FF228DD-F7F4-11E9-AE98-BD191AD9064B?image=1

NOTE! The PROV website collection search using a facet date range of 1956-1957 does return a result, i.e. https://prov.vic.gov.au/search_journey/select?keywords=dawn%20fraser&start_date=1956&end_date=1957&rows=10&iud=true&modifier=ALL_WORDS

Conal-Tuohy commented 2 years ago

Allow the HTML element to specify the comparison operator to be used, e.g.

<input type="date" name="start_dt" class="metadata ge">
<input type="date" name="end_dt" class="metadata le">

I would use the names of comparison operators defined in the XPath specification, because in that spec they have versions which are purely alphanumeric and don't use mathematical symbols, which makes them usable as the value of HTML class attributes. In XPath the operators are:

| operator name | meaning |
| --- | --- |
| gt | greater than |
| lt | less than |
| ge | greater than or equal to |
| le | less than or equal to |
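
The proposal above could be sketched roughly as follows. This is only an illustration in Python, not the tool's actual implementation: `RANGE_TEMPLATES`, `escape_date` and `range_clause` are hypothetical names, and the hyphen escaping simply mirrors the query URLs seen in this thread. It relies on Solr's range syntax, in which square brackets are inclusive bounds and curly brackets are exclusive bounds.

```python
# Hypothetical mapping from XPath-style operator names (usable as HTML
# class values) to Solr range-query templates.
# Square brackets are inclusive bounds, curly brackets are exclusive.
RANGE_TEMPLATES = {
    "ge": "{field}:[{value} TO *]",
    "gt": "{field}:{{{value} TO *]",
    "le": "{field}:[* TO {value}]",
    "lt": "{field}:[* TO {value}}}",
}


def escape_date(value: str) -> str:
    """Escape hyphens the way the query URLs in this thread do."""
    return value.replace("-", "\\-")


def range_clause(field: str, operator: str, value: str) -> str:
    """Build one parenthesised Solr clause for a date field."""
    template = RANGE_TEMPLATES[operator]
    return "(" + template.format(field=field, value=escape_date(value)) + ")"


if __name__ == "__main__":
    # A start date marked "ge" and an end date marked "le".
    print(range_clause("start_dt", "ge", "1956-01-01"))
    print(range_clause("end_dt", "le", "1965-01-01"))
```

With this mapping, a field marked `ge` produces an open-ended range starting at the entered date, and a field marked `le` produces a range ending at it.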

asaletourneau commented 2 years ago

Impressive!

asaletourneau commented 2 years ago

After more testing on the dates facet I can confirm the following results. The website search for dawn fraser shows 13 records for 1956-1965: https://prov.vic.gov.au/search_journey/selectkeywords=dawn%20fraser&start_date=1956&end_date=1965&rows=10&iud=true&modifier=ALL_WORDS&facet=Photographic%20Negatives%20[1956%20Melbourne%20Olympics%20Photograph%20Collection]&key=series&field=Series_title_facet

The API query form returns 0 records for dawn fraser for 1956-1965: http://prov.conaltuohy.com/search/query?wt=xml&q=(text%3A%22dawn%20fraser%22)%20AND%20(start_dt%3A%5B*%20TO%201956%5C-01%5C-01%5D)%20AND%20(end_dt%3A%5B1965%5C-01%5C-01%20TO%20*%5D)

I know there are many dummy dates for records, but for these dawn fraser records we actually do have dates and it'd be nice to capture them. What is of concern is that, following Daniel Wilksch's syntax for an API date range query, https://api.prov.vic.gov.au/search/select?q=category:Item%20AND%20series_id:10742AND%20date_range:[1956%20TO%201965] returns 0 results!

My final finding is that https://api.prov.vic.gov.au/search/select?q=category:Item%20AND%20text:(%20fraser)AND%20date_range:[1917%20TO%201917] does return 396 results using Daniel's cheat sheet syntax, but the API query form for the same date and keyword returns 12 records: http://prov.conaltuohy.com/search/query?csv.mv.separator=%7C&wt=xml&q=(text%3A%22fraser%22)%20AND%20(start_dt%3A%5B*%20TO%201917%5C-01%5C-01%5D)%20AND%20(end_dt%3A%5B1917%5C-12%5C-31%20TO%20*%5D)%20AND%20(iiif-manifest%3A(*))

Conal-Tuohy commented 2 years ago

> What is of concern is that, following Daniel Wilksch's syntax for an API date range query, https://api.prov.vic.gov.au/search/select?q=category:Item%20AND%20series_id:10742AND%20date_range:[1956%20TO%201965] returns 0 results!

The query URL has a missing space (or %20) after the series_id clause. This query works: https://api.prov.vic.gov.au/search/select?q=category:Item%20AND%20series_id:10742%20AND%20date_range:[1956%20TO%201965]
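
One way to avoid this kind of missing separator is to assemble the query from a list of clauses rather than typing the URL by hand. The following is a minimal Python sketch under that assumption; `build_select_url` is a hypothetical helper, and the choice to leave `:`, `[` and `]` unencoded simply matches the working URL above.

```python
from urllib.parse import quote

# Base URL taken from the query URLs quoted in this thread.
BASE_URL = "https://api.prov.vic.gov.au/search/select"


def build_select_url(clauses, base=BASE_URL):
    """Join Solr clauses with ' AND ' and percent-encode the spaces.

    Joining from a list guarantees a space on both sides of every AND,
    so a clause can never run into the operator that follows it.
    """
    query = " AND ".join(clauses)
    # Leave ':', '[' and ']' as-is, as in the working URL in this thread.
    return base + "?q=" + quote(query, safe=":[]")


if __name__ == "__main__":
    print(build_select_url([
        "category:Item",
        "series_id:10742",
        "date_range:[1956 TO 1965]",
    ]))
```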

Conal-Tuohy commented 2 years ago

> The API query form returns 0 records for dawn fraser for 1956-1965: http://prov.conaltuohy.com/search/query?wt=xml&q=(text%3A%22dawn%20fraser%22)%20AND%20(start_dt%3A%5B*%20TO%201956%5C-01%5C-01%5D)%20AND%20(end_dt%3A%5B1965%5C-01%5C-01%20TO%20*%5D)

The problem here is that there was a bug in the form markup. I had marked up the start_dt date field with a class of end and the end_dt field with a class of start, so the form was misinterpreting the semantics of the start and end dates. The query would return only records whose start date was on or BEFORE the start date entered, AND whose end date was on or AFTER the end date entered, i.e. records spanning the whole range entered. None of the dawn fraser records span 1956-1965, so nothing matched. I fixed up the form so that start_dt has a class of start and end_dt has a class of end, and it appears to be fixed.
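
The effect of the swapped classes can be illustrated with a small Python sketch. It is not the form's code: `matches_buggy` and `matches_fixed` are hypothetical names, and the sample record assumes the photo's start and end dates are both 30/11/1956, as the earlier exact-match query in this thread suggests.

```python
from datetime import date


def matches_buggy(rec_start, rec_end, query_start, query_end):
    """Swapped semantics: start_dt:[* TO query_start] AND end_dt:[query_end TO *]."""
    return rec_start <= query_start and rec_end >= query_end


def matches_fixed(rec_start, rec_end, query_start, query_end):
    """Intended semantics: start_dt:[query_start TO *] AND end_dt:[* TO query_end]."""
    return rec_start >= query_start and rec_end <= query_end


if __name__ == "__main__":
    # Assumed record dates for the photo discussed in this thread.
    photo_start = photo_end = date(1956, 11, 30)
    query_start, query_end = date(1956, 1, 1), date(1965, 1, 1)

    # The buggy form only matches records spanning the whole range.
    print(matches_buggy(photo_start, photo_end, query_start, query_end))
    # The fixed form matches records falling inside the range.
    print(matches_fixed(photo_start, photo_end, query_start, query_end))
```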

I created the query using the updated form and got 13 results:

http://prov.conaltuohy.com/search/query?wt=xml&q=(text%3A%22dawn%20fraser%22)%20AND%20(start_dt%3A%5B1956%5C-01%5C-01%20TO%20*%5D)%20AND%20(end_dt%3A%5B*%20TO%201965%5C-01%5C-01%5D)%20AND%20(iiif-manifest%3A(*))