AtlasOfLivingAustralia / ALA4R

Access data and resources hosted by the Atlas of Living Australia (ALA)
https://atlasoflivingaustralia.github.io/ALA4R/

Filtering `occurrences` by date with `fq` #13

Closed johnbaums closed 9 years ago

johnbaums commented 9 years ago

`occurrences` does not play nicely with date ranges passed to `occurrence_year` via `fq`, such as:

occurrences(taxon='lsid:urn:lsid:biodiversity.org.au:afd.taxon:ba8d0c3b-9753-46cf-87b4-a1b9ec290634',
            fq='occurrence_year[2000-01-01T00:00:00Z TO 2020-01-01T23:59:59Z]',
            record_count_only=TRUE)
## ...
## Error in check_fq(fq, type = "occurrence") : 
## invalid fields in fq: occurrence_year[2000-01-01T00, 2020-01-01T23. See ala_fields("occurrence_indexed")

yet http://biocache.ala.org.au/ws/occurrences/search?q=lsid:urn:lsid:biodiversity.org.au:afd.taxon:ba8d0c3b-9753-46cf-87b4-a1b9ec290634&fq=occurrence_year:[2000-01-01T00:00:00Z%20TO%202020-01-01T23:59:59Z]&pageSize=0 returns the expected, date-filtered result.

URL-encoding the spaces and/or colons in the `occurrences` call doesn't get around this (it returns count=0).

Am I using the wrong incantation here, or might `check_fq` be adjusted to permit this type of date filtering? Skipping over `check_fq` results in a working URL being constructed, and the correct count (35693, today at least) being returned.
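
For reference, the URL quoted above can be assembled by hand in base R. This is only a sketch: `build_search_url` is a hypothetical helper, not an ALA4R function, and it relies on `utils::URLencode` with its default `reserved = FALSE`, which encodes spaces but leaves colons and square brackets alone.

```r
# Hypothetical helper (not part of ALA4R): assemble a biocache search URL
# from a query string and a single filter query.
build_search_url <- function(q, fq, page_size = 0) {
  base <- "http://biocache.ala.org.au/ws/occurrences/search"
  paste0(base,
         "?q=", utils::URLencode(q),      # spaces become %20; ':' is kept
         "&fq=", utils::URLencode(fq),    # '[' and ']' are kept as well
         "&pageSize=", page_size)
}

q  <- "lsid:urn:lsid:biodiversity.org.au:afd.taxon:ba8d0c3b-9753-46cf-87b4-a1b9ec290634"
fq <- "occurrence_year:[2000-01-01T00:00:00Z TO 2020-01-01T23:59:59Z]"

url <- build_search_url(q, fq)
print(url)
```

The result should match the URL above character for character; actually fetching it is left out here so the sketch runs without network access.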

raymondben commented 9 years ago

There's a minor typo in your R command: you need a colon after `occurrence_year`, i.e. `occurrence_year:[... TO ...]`. Nevertheless, it still doesn't work, because `check_fq` doesn't handle ranges. It needs more sophisticated parsing; as a workaround for the time being, I've changed `check_fq` to issue a warning rather than an error (v1.19).
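
To illustrate what range-aware parsing might look like, here is a minimal sketch in base R. It is not the `check_fq` implementation: `extract_fq_fields` is a hypothetical helper that drops bracketed ranges and quoted values before looking for `field:` prefixes, so the colons inside timestamps are not mistaken for field separators.

```r
# Hypothetical helper (not part of ALA4R): pull candidate field names
# out of an fq string.
extract_fq_fields <- function(fq) {
  # Remove range expressions such as [a TO b] or {a TO b}.
  stripped <- gsub("[\\[{][^\\]}]*[\\]}]", "", fq, perl = TRUE)
  # Remove double-quoted values, which may also contain colons.
  stripped <- gsub("\"[^\"]*\"", "", stripped)
  # A field name is taken to be a run of word characters followed by a colon.
  hits <- regmatches(
    stripped,
    gregexpr("[A-Za-z_][A-Za-z0-9_.]*(?=:)", stripped, perl = TRUE)
  )
  unique(unlist(hits))
}

with_colon    <- "occurrence_year:[2000-01-01T00:00:00Z TO 2020-01-01T23:59:59Z]"
without_colon <- "occurrence_year[2000-01-01T00:00:00Z TO 2020-01-01T23:59:59Z]"

print(extract_fq_fields(with_colon))     # "occurrence_year"
print(extract_fq_fields(without_colon))  # character(0): no field found
```

A caller could then compare the extracted names against `ala_fields("occurrence_indexed")` and warn about anything unrecognised, including the case where no field is found at all.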

raymondben commented 9 years ago

`fq` parsing is now more robust (v1.191), but probably not foolproof, so I've left it issuing a warning rather than an error.

johnbaums commented 9 years ago

Nice one, thanks Ben.