biodiversitydata-se / SBDI4R

R package to search and access data made available through the Swedish biodiversity data infrastructure SBDI
https://biodiversitydata-se.github.io/SBDI4R/
GNU Affero General Public License v3.0

Add filters for coordinates uncertainty #32

Closed aleruete closed 3 years ago

aleruete commented 3 years ago

Can this be added to pick_filter?

aleruete commented 3 years ago

@shahmanash I am trying to filter by, e.g., coordinate_uncertainty, which is indexed and of type "double". How should I format the query to filter for values greater than (>) a threshold? It works fine with &fq=coordinate_uncertainty:100 for exact values.

shahmanash commented 3 years ago

For 100 or greater: fq = "coordinate_uncertainty:[100 TO *]"
For 100 or less: fq = "coordinate_uncertainty:[* TO 100]"
Between 200 and 300: fq = "coordinate_uncertainty:[200 TO 300]"
Note that square brackets make both bounds inclusive.

A quick summary on SOLR queries https://blog.imaginea.com/solr-query-equivalent-to-the-database-query/
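The range syntax above can be generated from R rather than typed by hand. The following is a minimal sketch, assuming a hypothetical helper named `solr_range()` (it is not part of SBDI4R). It relies on the standard Solr convention that square brackets are inclusive and curly braces are exclusive, and it only builds strings; no request is sent.

```r
# Hypothetical helper (not an SBDI4R function): build a Solr range filter.
# NULL for a bound means "open ended" and is written as "*".
solr_range <- function(field, lower = NULL, upper = NULL,
                       include_lower = TRUE, include_upper = TRUE) {
  lo <- if (is.null(lower)) "*" else format(lower, scientific = FALSE)
  hi <- if (is.null(upper)) "*" else format(upper, scientific = FALSE)
  left  <- if (include_lower) "[" else "{"  # "[" inclusive, "{" exclusive
  right <- if (include_upper) "]" else "}"
  paste0(field, ":", left, lo, " TO ", hi, right)
}

fq_ge      <- solr_range("coordinate_uncertainty", lower = 100)
fq_gt      <- solr_range("coordinate_uncertainty", lower = 100,
                         include_lower = FALSE)
fq_le      <- solr_range("coordinate_uncertainty", upper = 100)
fq_between <- solr_range("coordinate_uncertainty", lower = 200, upper = 300)

# Percent-encode the filter before appending it to a web service URL.
# The endpoint is the one quoted later in this thread.
request_url <- paste0(
  "https://records.bioatlas.se/ws/occurrences/search?fq=",
  utils::URLencode(fq_ge, reserved = TRUE)
)
```

Here `fq_gt` uses a curly brace on the lower bound, which is how a strict "greater than" would be expressed if the inclusive form is not what you want.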

aleruete commented 3 years ago

20

aleruete commented 3 years ago

@DeboraArlt the pick_filter() function could gain the type "indexed", returning all indexed columns and allowing the user to create a query. It could also easily translate R syntax into SOLR syntax.

We can't, however, list all potential answers to all indexed variables... as we do for institutions.
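As a rough illustration of what translating R syntax into SOLR syntax might look like, here is a sketch of a hypothetical `r_to_solr()` function. It is an assumption about one possible design, not existing SBDI4R behaviour, and it only handles a single comparison between a field name and a numeric value.

```r
# Hypothetical translator (not an SBDI4R function): turn a simple R
# comparison such as `coordinate_uncertainty > 100` into a Solr fq string.
r_to_solr <- function(expr) {
  e <- substitute(expr)                    # capture the unevaluated call
  stopifnot(is.call(e), length(e) == 3)    # expect: <field> <op> <value>
  op    <- as.character(e[[1]])
  field <- as.character(e[[2]])
  value <- eval(e[[3]], parent.frame())    # evaluate the value in the caller
  stopifnot(is.numeric(value), length(value) == 1)
  v <- format(value, scientific = FALSE)
  # Solr ranges: "[" / "]" are inclusive bounds, "{" / "}" are exclusive.
  switch(op,
    "==" = paste0(field, ":", v),
    ">=" = paste0(field, ":[", v, " TO *]"),
    ">"  = paste0(field, ":{", v, " TO *]"),
    "<=" = paste0(field, ":[* TO ", v, "]"),
    "<"  = paste0(field, ":[* TO ", v, "}"),
    stop("unsupported operator: ", op)
  )
}

limit <- 250
fq_strict <- r_to_solr(coordinate_uncertainty > 100)
fq_upto   <- r_to_solr(coordinate_uncertainty <= limit)
```

Capturing the expression with `substitute()` lets the user write the field name unquoted, while the right-hand side is still evaluated, so variables such as `limit` work as expected.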

aleruete commented 3 years ago

@shahmanash some indexed fields seem to send searches into a loop that never ends (e.g. county, with fq=county:Uppsala). I also tried :"Uppsala" and :%22Uppsala%22, with no success. However, the spatial search using fq=cl10097:Uppsala works fine.

shahmanash commented 3 years ago

fq = "county:Uppsala" is equivalent to https://records.bioatlas.se/ws/occurrences/search?fq=county:Uppsala

fq = "cl10097:Uppsala" is equivalent to https://records.bioatlas.se/ws/occurrences/search?fq=cl10097:Uppsala

Please note that the search fq = "cl10097:Uppsala" is based on the layer https://spatial.bioatlas.se/ws/layers/view/more/lan
This request is fundamentally different from the earlier string-based search fq = "county:Uppsala".

During ingestion of data, records are spatially processed against geographical layers. For example, records with coordinates that fall within the polygon Uppsala of the layer https://spatial.bioatlas.se/ws/layers/view/more/lan (10097) have the value **cl10097:Uppsala** in the database and SOLR index.

Records with the string Uppsala in the county field are fetched by the request fq = "county:Uppsala", irrespective of whether they have geo-coordinates that fall within Uppsala. For example, if a historical record has the value Uppsala in the county field but does NOT have geo-coordinates, it will be fetched by fq = "county:Uppsala" but not by fq = "cl10097:Uppsala".
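The difference between the two filters can be mimicked on a toy table. This is only a sketch: the three records are invented, and a crude bounding box with made-up limits stands in for the real polygon of layer 10097, so it illustrates the logic rather than the actual ingestion code.

```r
# Toy records, invented for illustration (not real SBDI data).
records <- data.frame(
  id        = c("a", "b", "c"),
  county    = c("Uppsala", "Uppsala", NA),  # free-text field from the provider
  latitude  = c(59.86, NA, 59.90),
  longitude = c(17.64, NA, 17.70),
  stringsAsFactors = FALSE
)

# Stand-in for spatial processing at ingestion. ASSUMPTION: a rough
# bounding box replaces the real polygon; the limits are illustrative only.
in_layer <- with(records,
  !is.na(latitude) & !is.na(longitude) &
    latitude > 59.5 & latitude < 60.5 &
    longitude > 17.0 & longitude < 18.5
)
records$cl10097 <- ifelse(in_layer, "Uppsala", NA)

# Analogue of fq = "county:Uppsala": match on the string field only.
by_string <- records$id[!is.na(records$county) & records$county == "Uppsala"]

# Analogue of fq = "cl10097:Uppsala": match on the spatially derived value.
by_spatial <- records$id[!is.na(records$cl10097) & records$cl10097 == "Uppsala"]
```

Record "b" plays the role of the historical record without coordinates: it appears in `by_string` but not in `by_spatial`. Record "c" shows the reverse case, where coordinates fall inside the area but the county field is empty.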

DeboraArlt commented 3 years ago

> @DeboraArlt the pick_filter() function could gain the type "indexed", returning all indexed columns and allowing the user to create a query. It could also easily translate R syntax into SOLR syntax.
>
> We can't, however, list all potential answers to all indexed variables... as we do for institutions.

@aleruete No, we cannot list all potential answers. But getting a list of all indexed fields is useful, so that the user knows which columns can be queried. We already get this info with sbdi_fields, though, right?