Open pleary opened 7 years ago
So I guess the idea here would be to modify InaturalistAPI.lookupPreferredPlaceMiddleware to do something like look up places in the database by their admin_level and code attributes.
I started investigating this issue and I have a few concerns:
code field is absent in elasticsearch indexes for places;
code field is filled only for US states and is empty for countries.
Is code field used for countries on non-test environments?
Which approach is better for searching by code: by adding it to elasticsearch indexes or by selecting from the db?
Is code field used for countries on non-test environments?
Generally yes, especially for countries. You can see this using the old Rails-based JSON endpoints, e.g. https://www.inaturalist.org/places/russia.json, where the code field is set to RU.
Which approach is better for searching by code: by adding it to elasticsearch indexes or by selecting from the db?
IMO, since this is a pretty quick lookup and we're not planning on using it for search, I would fetch it out of the database. If that becomes a performance problem, we could add it to elasticsearch later. @pleary do you have an opinion on this?
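A minimal sketch of what the database lookup could look like, assuming countries are stored with admin_level 0 and that code holds the two-letter country code (as with RU above). The helper names parseLocale and buildPlaceByCodeQuery are hypothetical, and the returned object only follows the { text, values } query shape that node-postgres accepts; this is not the actual middleware code.

```javascript
// Hypothetical helper: split a locale such as "en-NZ" into its parts.
// Only "xx" and "xx-YY" (or "xx_YY") forms are handled; anything else returns null.
function parseLocale(locale) {
  const match = /^([a-z]{2,3})(?:[-_]([a-z]{2}))?$/i.exec(String(locale || "").trim());
  if (!match) {
    return null;
  }
  return {
    language: match[1].toLowerCase(),
    region: match[2] ? match[2].toUpperCase() : null
  };
}

// Assumption: country-level places have admin_level = 0.
const COUNTRY_ADMIN_LEVEL = 0;

// Hypothetical helper: build a parameterized query for the place matching
// the region part of the locale, or null when the locale has no region.
function buildPlaceByCodeQuery(locale) {
  const parsed = parseLocale(locale);
  if (!parsed || !parsed.region) {
    return null;
  }
  return {
    text: "SELECT id, name, ancestry FROM places WHERE admin_level = $1 AND code = $2 LIMIT 1",
    values: [COUNTRY_ADMIN_LEVEL, parsed.region]
  };
}
```

Returning null for locales without a region lets the caller skip the query entirely, so plain requests like locale=en would not hit the database.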
I’ve noticed that the ancestry field in the places table looks inconsistent: in the Node.js test seeds for postgres (fixtures.js) it includes the id of the current item:
{
"id": 222,
"name": "California",
"ancestry": "111/222"
},
In the test db and in the Rails code it doesn’t contain the id of the current record, only the ids of its ancestors; the id of the current record is pushed to ancestor_place_ids during processing:
id | name | ancestry
-----+------------+----------
297 | California | 17
Should I assume that the ancestry field contains only ancestor ids, without the current id?
Weird, that's probably a problem with the fixture, so yes, assume the ancestry field only contains ancestor IDs, not the ID of the record itself.
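For illustration, here is one way ancestor_place_ids could be derived from such a record. The function name ancestorPlaceIds is hypothetical, and dropping an already-present own id is an extra guard added here so the inconsistent fixture format does not produce duplicates; it is not something the Rails code is known to do.

```javascript
// Hypothetical helper: derive ancestor_place_ids from a place row.
// Assumes ancestry is a "/"-separated string of ancestor ids (or null/empty
// for a root place) that does not include the record's own id.
function ancestorPlaceIds(place) {
  const ancestors = (place.ancestry || "")
    .split("/")
    .filter(part => part.length > 0)
    .map(Number);
  // Guard: tolerate fixtures whose ancestry already ends with the own id.
  const withoutSelf = ancestors.filter(id => id !== place.id);
  // The record's own id goes last, after all of its ancestors.
  return [...withoutSelf, place.id];
}
```

With the rows quoted above, the database row (id 297, ancestry "17") and the fixture row (id 222, ancestry "111/222") both yield a list that ends with the record's own id exactly once.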
One unexpected consequence of this that we need to figure out: because we prioritize names in a place over names matching a locale without a place, people requesting names in en-HK are getting Chinese names in Hong Kong when an English name exists but lacks a place association. For example, when you request https://api.inaturalist.org/v1/taxa/627207?locale=en-HK, the preferred_common_name is "大頭茶." I'm going to temporarily disable this until we figure that out. IMO, the right solution is to change the way we prioritize the names, but I think that's going to have some other unexpected and maybe more widespread consequences, which we should probably just deal with... when we have the bandwidth. Alternatively, we could give lower weight to the places we extract from the locale code.
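To make the trade-off concrete, here is a rough sketch of one possible re-prioritization: match the requested language first and use the place only as a tie-breaker within that language. Everything here is an assumption for illustration (the pickCommonName function, the shape of the name records, and the sample data); it is not how the API currently ranks names.

```javascript
// Hypothetical name records: { name, locale, placeId }, where placeId is
// null when the name has no place association.
function pickCommonName(names, language, placeId) {
  // 1. Prefer names in the requested language, if there are any.
  const inLanguage = names.filter(n => n.locale === language);
  const candidates = inLanguage.length > 0 ? inLanguage : names;
  // 2. Among those, prefer a name associated with the preferred place.
  const placeMatch = placeId == null
    ? undefined
    : candidates.find(n => n.placeId === placeId);
  const chosen = placeMatch || candidates[0];
  return chosen ? chosen.name : null;
}

// Made-up data mirroring the en-HK situation: the Chinese name is tied to
// the place (id 1 is a placeholder), while the English name has no place.
const sampleNames = [
  { name: "大頭茶", locale: "zh", placeId: 1 },
  { name: "Example English Name", locale: "en", placeId: null }
];
const picked = pickCommonName(sampleNames, "en", 1);
```

Under this ordering the en-HK request would get the English name, and the place-specific Chinese name would still be returned when no name exists in the requested language.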
Some backstory regarding our current name priority is at https://groups.google.com/u/1/g/inaturalist/c/P8iNMY0WYNM/discussion
The locale param can include a locality code in addition to the language code (e.g. en-NZ or es-MX). We could look up the iNaturalist place equivalents to the location portion of the code and use that as a default preferred_place (e.g. en-NZ sets a default preferred_place_id of 6803, New Zealand's iNat place_id).
I suggest the order of precedence of preferred_place from least to most important: