gbv / jskos-server

Web service to access JSKOS data
https://coli-conc.gbv.de/api/
MIT License

Support geospatial search #88

Closed: nichtich closed this issue 1 year ago

nichtich commented 4 years ago

Given a geographical coordinate, get a list of concepts with location near this coordinate.

See geospatial queries in MongoDB for implementation. The coordinate could be passed as an additional query parameter near at the endpoint /voc/concepts, as a comma-separated pair of longitude and latitude, e.g. /voc/concepts?near=-49.94,41.73

An optional parameter distance, which restricts results to concepts within a given distance (in km), may be a helpful additional feature.
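
For illustration, the proposal could map onto a MongoDB filter roughly as follows. This is only a sketch, not jskos-server code: the helper name near_filter is made up, it assumes the GeoJSON point is stored in the JSKOS field location, and it assumes a 2dsphere index exists on that field (which $nearSphere needs for GeoJSON points). $maxDistance is given in meters when a GeoJSON geometry is used, so the km value is converted.

```python
from typing import Any, Dict, Optional


def near_filter(
    longitude: float,
    latitude: float,
    distance_km: Optional[float] = None,
) -> Dict[str, Any]:
    """Build a MongoDB filter for concepts located near a point.

    Hypothetical helper: the field name "location" and the km unit
    follow the proposal in this issue, not any existing code.
    """
    near: Dict[str, Any] = {
        # GeoJSON positions are [longitude, latitude]
        "$geometry": {"type": "Point", "coordinates": [longitude, latitude]},
    }
    if distance_km is not None:
        # With a GeoJSON $geometry, $maxDistance is expressed in meters
        near["$maxDistance"] = distance_km * 1000
    return {"location": {"$nearSphere": near}}


# Filter for /voc/concepts?near=-49.94,41.73 with a distance of 5 km
print(near_filter(-49.94, 41.73, distance_km=5))
```

The resulting dictionary could be passed to a find call of a MongoDB driver. Results of $nearSphere come back sorted from nearest to farthest.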

stefandesu commented 4 years ago

A few notes:

The actual implementation for this should be really simple.

nichtich commented 4 years ago

Thanks for feedback.

stefandesu commented 4 years ago

The JSKOS records will include GeoJSON with reverse order!

Maybe that's what you mean, but I think we shouldn't go against GeoJSON specification. It requires longitude, latitude in that order. I'm pretty sure swapping the two will break distance calculation in MongoDB as well.

I'm fine with latitude, longitude for the query parameter, but we should (and have to) keep to the specification (longitude, latitude) for GeoJSON.
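
To make the two orders concrete, here is a rough sketch of how a latitude, longitude query value could be turned into a GeoJSON point in longitude, latitude order. The function name parse_near and the range checks are assumptions for illustration, not part of any existing implementation.

```python
from typing import Any, Dict


def parse_near(value: str) -> Dict[str, Any]:
    """Parse "latitude,longitude" into a GeoJSON Point.

    Hypothetical helper: the query parameter uses latitude first,
    while the returned GeoJSON position uses longitude first.
    """
    parts = value.split(",")
    if len(parts) != 2:
        raise ValueError("expected two comma-separated numbers")
    latitude, longitude = (float(part) for part in parts)
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")
    # Swap into the order required by the GeoJSON specification
    return {"type": "Point", "coordinates": [longitude, latitude]}


print(parse_near("41.73,-49.94"))
```

Doing the swap in one place at the API boundary means the stored records and the MongoDB query both stay in GeoJSON order.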

  • Set a default value for $maxDistance and distance in meters instead of km. Maybe 1000m as default?

I would have chosen a larger value, maybe 10000m, otherwise there often won't be any results, right? Or maybe you're right and 1000m would be better because Wikidata will probably have lots of concepts with coordinates and 1000m would be enough to return a lot of them.

stefandesu commented 4 years ago

Interesting related tidbit I found on https://macwright.org/2015/03/23/geojson-second-bite.html#position:

A position is an array of coordinates in order: this is the smallest unit that we can really consider ‘a place’ since it can represent a point on earth. GeoJSON describes an order for coordinates: they should go, in order:

[longitude, latitude, elevation]

This order can be surprising. Historically, the order of coordinates is usually “latitude, longitude”, and many people will assume that this is the case universally. Long hours have been wasted discussing which is better, but for this discussion, I’ll summarize as such:

  • longitude, latitude matches the X, Y order of math
  • data formats usually use longitude, latitude order
  • applications have tended to use latitude, longitude order

Here’s a handy chart of what uses which ordering.

stefandesu commented 1 year ago

From #211:

GET /concepts endpoint should support query by location with geospatial query for JSKOS field location. See Nomisma Vocabulary for example records. $nearSphere should be enough to get all concepts near a given point on the globe.

Query parameters:

  • near: coordinate (not sure about best syntax, there are several)
  • radius: maxDistance

stefandesu commented 1 year ago

I just tried this out in my local MongoDB using Nomisma data and it seems to work very well (after I fixed the data format used in Nomisma)!

Open question before I can implement this:

@nichtich

nichtich commented 1 year ago

Evaluating several existing similar APIs I found that

Default value for distance could depend on precision of lat/long but better just set to 1km to start with. We can adjust when needed.

So /voc/concepts?near=51.534,9.936 would roughly locate the downtown of Göttingen (GeoJSON Point ["9.9355555555556","51.533888888889"]) and /voc/concepts?near=51.534,9.936&distance=5 would also catch Rosdorf.
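
As a rough sanity check of the example above, the distance between the query point and the record's point can be estimated with the haversine formula. This is only a sketch: the Earth radius is the usual mean-radius approximation, haversine_m is a made-up helper, and the string coordinates are converted to numbers first because they appear as strings in the example record.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (approximation)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Query parameter near=51.534,9.936 is latitude, longitude
query_lat, query_lon = 51.534, 9.936

# Example record from the comment above: GeoJSON order, values as strings
record = {"type": "Point", "coordinates": ["9.9355555555556", "51.533888888889"]}
record_lon, record_lat = (float(c) for c in record["coordinates"])

distance = haversine_m(query_lat, query_lon, record_lat, record_lon)
print(f"{distance:.1f} m")
```

The result is a few tens of meters, so the record falls well inside the 1 km default.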

stefandesu commented 1 year ago

I've implemented this on the /voc/concepts endpoint.

However, I've noticed that my first intuition was to use the /concepts endpoint, not the /voc/concepts endpoint. The current implementation of the /concepts endpoint does not allow this. It returns zero results if neither a uri nor a notation is given, so it doesn't work like e.g. the /mappings endpoint (with its various filters). I personally think that it SHOULD work like that, which would also mean that a GET /concepts request without any parameters would list all concepts in the database.

What do you think, @nichtich? I think I'll postpone the 2.1.0 release until we've resolved this.

stefandesu commented 1 year ago

The /concepts endpoint should be adjusted so that it'll work as long as ANY parameter is given (not requiring uri or notation specifically).

This also means that the code can be simplified by using the same code for /voc/concepts and /concepts and making /voc/concepts?uri={schemeUri} and /concepts?voc={schemeUri} equivalent.
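
A rough sketch of what that equivalence could look like: both endpoints are reduced to the same set of filters before any shared query code runs. The function concept_query and its parameter handling are hypothetical and only illustrate the idea; the scheme URI in the usage example is a placeholder.

```python
from typing import Dict, Mapping


def concept_query(path: str, params: Mapping[str, str]) -> Dict[str, str]:
    """Normalize requests to /voc/concepts and /concepts into one filter set.

    Hypothetical helper: on /voc/concepts the scheme is given as `uri`,
    on /concepts it is given as `voc`. All other parameters pass through.
    """
    query = dict(params)
    if path == "/voc/concepts":
        if "uri" in query:
            # Here `uri` identifies the concept scheme, so rename it to `voc`
            query["voc"] = query.pop("uri")
    elif path != "/concepts":
        raise ValueError(f"unsupported endpoint: {path}")
    return query


scheme_uri = "http://example.org/scheme"  # placeholder scheme URI
a = concept_query("/voc/concepts", {"uri": scheme_uri, "near": "51.534,9.936"})
b = concept_query("/concepts", {"voc": scheme_uri, "near": "51.534,9.936"})
print(a == b)
```

With this kind of normalization, filters such as near only need to be handled once, regardless of which endpoint received the request.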