MushroomObserver / mushroom-observer

A website for sharing observations of mushrooms.
https://mushroomobserver.org
MIT License

iNat Location #2218

Open JoeCohen opened 2 weeks ago

JoeCohen commented 2 weeks ago

Convert iNat Location to MO Location when importing iNat observations. iNat observations have a location (lat/lng point) and a positional accuracy (in meters). Convert this to the MO Location that is the MBR (Minimum Bounding Rectangle) of the iNat Location +/- public_positional_accuracy.

Tasks

JoeCohen commented 2 weeks ago

Get the list of locations containing lat/lng + public accuracy.

Create a box. Maybe the mappable box methods will help, as I need to add meters to lat/lng:

n = min(lat + public accuracy, 90)
s = max(lat - public accuracy, -90)

e/w get tricky because public accuracy can push the box to the other side of 180.

Maybe first find the smallest of those locations:

mbr_area = infinity
mbr = nil
each loc
  loc_area = calculate area
  next unless loc_area < mbr_area
  mbr = loc
  mbr_area = loc_area
end

Perhaps I can partly piggyback on app/classes/mappable/box.rb and app/classes/mappable/box_methods.rb.
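The box construction described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `PointBox`, `wrap_lng`, and `box_around` are hypothetical names (not the API of `Mappable::Box`), and the meters-to-degrees conversion uses a rough spherical approximation of about 111,320 m per degree of latitude.

```ruby
# Sketch only: PointBox, wrap_lng and box_around are hypothetical names,
# not part of MO's Mappable::Box API.
METERS_PER_DEGREE = 111_320.0 # rough length of one degree of latitude

PointBox = Struct.new(:north, :south, :east, :west)

# Normalize a longitude into the range -180...180
def wrap_lng(lng)
  ((lng + 180.0) % 360.0) - 180.0
end

def box_around(lat, lng, accuracy_m)
  d_lat = accuracy_m / METERS_PER_DEGREE
  # A degree of longitude shrinks by cos(latitude) away from the equator
  cos_lat = Math.cos(lat * Math::PI / 180.0)
  d_lng = cos_lat > 1e-6 ? d_lat / cos_lat : 180.0

  # Clamp latitude at the poles
  north = [lat + d_lat, 90.0].min
  south = [lat - d_lat, -90.0].max
  # Near a pole, or with a huge radius, the box spans every longitude
  return PointBox.new(north, south, 180.0, -180.0) if d_lng >= 180.0

  PointBox.new(north, south, wrap_lng(lng + d_lng), wrap_lng(lng - d_lng))
end
```

A result whose `west` is greater than its `east` means the box straddles the 180 meridian, so any containment check has to treat that case separately.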

JoeCohen commented 2 weeks ago

Andrew Nimmo: The BoxMethods class computes area, and the AutoComplete::ForLocationContaining class sorts by this method. You could use that as an example.

https://mushroomobserver.slack.com/archives/C040TH9FV/p1720559425916179
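As a rough illustration of "sort by area and take the smallest", here is a minimal sketch. It does not use the real `BoxMethods` or `AutoComplete::ForLocationContaining` code; the `Loc` struct, its `area` (in square degrees, good enough for ranking only), and `smallest_containing` are all hypothetical, and the sample boxes are made-up approximations.

```ruby
# Sketch only: Loc and smallest_containing are hypothetical stand-ins
# for MO Locations and their box methods.
Loc = Struct.new(:name, :north, :south, :east, :west) do
  # Width in degrees, allowing for boxes that straddle the 180 meridian
  def width
    west <= east ? east - west : east - west + 360.0
  end

  # Rough area in square degrees; adequate for ranking, not measurement
  def area
    (north - south) * width
  end

  def contains?(lat, lng)
    return false unless lat.between?(south, north)

    west <= east ? lng.between?(west, east) : (lng >= west || lng <= east)
  end
end

# Smallest-area location containing the point, or nil if none does
def smallest_containing(locations, lat, lng)
  locations.select { |loc| loc.contains?(lat, lng) }.min_by(&:area)
end

# Made-up, approximate boxes for illustration only
locations = [
  Loc.new("USA", 49.0, 25.0, -66.0, -125.0),
  Loc.new("Arizona, USA", 37.0, 31.3, -109.0, -114.8),
  Loc.new("Cochise Co., Arizona, USA", 32.4, 31.3, -109.0, -110.5)
]
best = smallest_containing(locations, 31.9, -109.9)
```

Using `min_by` avoids the manual `mbr_area = infinity` bookkeeping, and a `nil` result signals that no existing location contains the point.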

nimmolo commented 1 week ago

If the iNat obs contains only a GPS point and it doesn't match an existing MO Location, we probably do not need to create an MO location. (What good would an automatic fudged-area Location be, anyway? It's like a less precise way of describing an obs point; it seems no one else would choose to use such a Location, I'm thinking.)

Also, Observations do not require Locations. They should show up on occurrence maps without a Location association, although that is temporarily not the case.

Suggestions -

JoeCohen commented 1 week ago

@nimmolo: Thanks for the suggestion. I'll give it some thought. I agree that polling Google is generally not worth the effort; it might be worth it if the bounding boxes would otherwise be too large. Some issues:

Can you clarify: "Some iNat obs presumably have non-point Locations, and these are the ones we should be importing (according to this suggestion)"? We need to import whichever iNat observations MO users want to import. And, IMO, each imported obs needs to have an MO Location. We don't need to import the iNat place. (And doing so would be complex, as the places can have complex polygonal boundaries.)

nimmolo commented 1 week ago

Interesting.

Sounds like we should be taking the smallest encompassing iNat "Place" and getting the name and bounding box of that. (Any geometry column should also have the MBR available in the data list; it's part of the spec.) We could ignore the other geometry of the place. The place is the one that's likely to have a meaningful, searchable Name (like a county or park, or whatever), and this name may even correspond to an MO Location.

The most attractive option to me, since in my mind we are moving towards using geometry columns for Locations in the near future, would be to import the iNat Place geometry into MO without necessarily using it yet. If we move to the geometry data type for obs and locations, observations will no longer need an associated Location, only a lat/lng. The user could search for a Location like "Cochise Co., Arizona, USA" and the database query could pretty easily compute all observations whose GPS point, or associated Location bounding box, lies within that geometry. (The geometry can be a bounding box or a polygon; it doesn't matter. It can also start as a bounding box and later be converted to a polygon.) That would be a significant improvement for MO, where the obs must currently be explicitly associated with Cochise Co., not one of the towns within it, to show up in such a search. This is how iNat is doing it, I would imagine.
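To make the "point lies within that geometry" idea concrete, here is a small in-memory sketch using the standard ray-casting test. It is only an illustration: a real implementation would more likely rely on the database's spatial functions, and `Obs`, `point_in_polygon?`, and `observations_within` are hypothetical names. It assumes a simple polygon given as `[lat, lng]` vertices that does not cross the 180 meridian.

```ruby
# Sketch only: Obs, point_in_polygon? and observations_within are
# hypothetical names, not MO models or methods.
Obs = Struct.new(:id, :lat, :lng)

# Ray casting: count how many polygon edges a ray from the point crosses.
# polygon is an array of [lat, lng] vertices (no 180-meridian crossing).
def point_in_polygon?(lat, lng, polygon)
  inside = false
  polygon.each_with_index do |(lat_i, lng_i), i|
    lat_j, lng_j = polygon[i - 1] # previous vertex; wraps to the last one
    # Skip edges that do not span the point's longitude
    next unless (lng_i > lng) != (lng_j > lng)

    # Latitude at which this edge crosses the point's longitude
    crossing_lat = lat_i + ((lng - lng_i) * (lat_j - lat_i)).fdiv(lng_j - lng_i)
    inside = !inside if lat < crossing_lat
  end
  inside
end

def observations_within(observations, polygon)
  observations.select { |o| point_in_polygon?(o.lat, o.lng, polygon) }
end
```

A bounding box is just the four-vertex special case of this, which is why the geometry can start as a box and later become a polygon without changing the query logic.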

To arrive at an algorithm for selecting among the iNat Places may seem tricky, but I would guess the "standard" types are probably the Google types that I'm picking through with the Google place matching. (If there's anything called administrative_level_3 etc, then these are most likely Google's map place categories.)

I don't think we have a use for the "accuracy" stuff yet, although it would be good if we did. I still feel like an MBR produced from their accuracy radius is little better than fudging points; it doesn't really add to the accuracy or searchability.

JoeCohen commented 1 week ago
nimmolo commented 1 week ago

There is a way to get an MBR from the Place polygon, even if that polygon is not a box (i.e. rectangle). There's both a db-native way and a mathematical-function way.
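The mathematical-function way amounts to taking the extremes of the polygon's vertices. A minimal sketch, assuming the polygon is an array of `[lat, lng]` pairs that does not cross the 180 meridian (`polygon_mbr` is a hypothetical name, and the sample vertices are made up):

```ruby
# Sketch only: polygon_mbr is a hypothetical helper.
# polygon is an array of [lat, lng] vertices (no 180-meridian crossing).
def polygon_mbr(polygon)
  lats = polygon.map(&:first)
  lngs = polygon.map(&:last)
  { north: lats.max, south: lats.min, east: lngs.max, west: lngs.min }
end

# Made-up triangle for illustration
triangle = [[31.3, -110.5], [32.4, -109.8], [31.9, -109.0]]
box = polygon_mbr(triangle)
```

For the db-native way, spatial SQL functions such as `ST_Envelope` (available in both MySQL and PostGIS) return the bounding rectangle of a geometry; whether MO would use that is an open question in this thread.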

(I'm still feeling like one of iNat's Places is likely the most relevant "Location" we'd want to save for our users, because it's likely to conform to the kind of string a user would enter, as a generally-recognized name of an area.)