biodiversitydata-se / SBDI4R

R package to search and access data made available through the Swedish biodiversity data infrastructure SBDI
https://biodiversitydata-se.github.io/SBDI4R/
GNU Affero General Public License v3.0

search for species observations using threshold values #24

Open DeboraArlt opened 3 years ago

DeboraArlt commented 3 years ago

Similar to coordinate uncertainty (#20): users want to get only observations above or below a threshold, e.g. count>50.

aleruete commented 3 years ago

I don't see any search field similar to count; otherwise it would be possible to filter by range, e.g. count:[50 TO *]
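For reference, a Solr-style range filter like the one above could be assembled into a query string as in the sketch below. The field name `count` is only the placeholder used in this comment, not a confirmed indexed field, and `range_fq` is a hypothetical helper, not part of SBDI4R or the records service.

```python
from urllib.parse import urlencode


def range_fq(field, lower=None, upper=None):
    """Build a Solr-style inclusive range filter, e.g. count:[50 TO *].

    A bound left as None becomes the open-ended wildcard "*".
    """
    lo = "*" if lower is None else str(lower)
    hi = "*" if upper is None else str(upper)
    return f"{field}:[{lo} TO {hi}]"


# "count" is a placeholder field name (assumption, see lead-in).
fq = range_fq("count", lower=50)

# URL-encode the filter so it can be appended to a search URL as ?fq=...
query_string = urlencode({"fq": fq})

print(fq)
print(query_string)
```

Whether such a filter behaves as a numeric threshold depends on how the target field is typed in the index.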

shahmanash commented 3 years ago

Do you mean that you wish to filter on the following Darwin Core term? https://dwc.tdwg.org/terms/#dwc:individualCount

As you can see from the list of indexed fields, https://records.bioatlas.se/ws/index/fields

the term individualCount is not indexed, but it can be indexed if necessary.

DeboraArlt commented 3 years ago

@shahmanash It would be nice if individualCount could be indexed, if that's easily done.

shahmanash commented 3 years ago

fq=individual_count:100

https://records.bioatlas.se/occurrences/search?fq=individual_count:100

https://records.bioatlas.se/ws/index/fields

{
  "name": "individual_count",
  "dataType": "string",
  "indexed": true,
  "stored": true,
  "multivalue": false,
  "docvalue": false,
  "description": "Individual Count",
  "info": "http://rs.tdwg.org/dwc/terms/individualCount",
  "dwcTerm": "individualCount",
  "classs": "Occurrence"
}

The field is indexed as a string, so it is not possible to do a range search on it. Changing the dataType might require changing the SOLR schema and any underlying logic.
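To illustrate why a string-typed count is awkward for thresholds: plain strings compare character by character, so "100" sorts before "50". Until the field is typed as a number, one possible workaround is to filter on the client after download. The sketch below assumes records arrive as dictionaries with an `individual_count` key holding text; the sample records and both helper functions are made up for illustration and are not part of SBDI4R.

```python
def parse_count(value):
    """Return the count as an int, or None if it is missing or not a whole number."""
    try:
        return int(str(value).strip())
    except (TypeError, ValueError):
        return None


def filter_by_min_count(records, minimum, field="individual_count"):
    """Keep records whose count, read as a number, is at least `minimum`."""
    kept = []
    for record in records:
        count = parse_count(record.get(field))
        if count is not None and count >= minimum:
            kept.append(record)
    return kept


# Made-up sample records, for illustration only.
records = [
    {"id": "a", "individual_count": "100"},
    {"id": "b", "individual_count": "50"},
    {"id": "c", "individual_count": "7"},
    {"id": "d", "individual_count": "several"},
    {"id": "e"},
]

# Compared as strings, "100" sorts before "50" ...
print("100" < "50")  # True
# ... while the numeric comparison gives the intended answer.
print(100 < 50)      # False

print([r["id"] for r in filter_by_min_count(records, 50)])  # ['a', 'b']
```

Records with a missing or non-numeric count are dropped here rather than raising an error; a stricter variant could collect them for inspection instead.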

DeboraArlt commented 3 years ago

@shahmanash Is it the SOLR schema that defines dataType as string? What decides that the dataType for individualCount is set to string? The content seems to be numbers (always?), so it really should be indexed as a number?