opencaching / opencaching-pl

The source code of Opencaching.PL (and some other domains)
https://opencaching.pl/
GNU General Public License v3.0

NUTS => geonames.org data #1977

Open kojoty opened 5 years ago

kojoty commented 5 years ago

OCPL code now uses NUTS data to provide, for example, the list of regions/states/provinces for a given country, but NUTS contains data for Europe only. We should migrate to data imported from http://www.geonames.org/, which seems to have reasonably good data for the whole world.

I will prepare scripts that import only the data used by OCPL, and a solution based on that.

kojoty commented 5 years ago

related issues:

kojoty commented 5 years ago

@andrixnet @harrieklomp @deg-pl as part of this issue I've refactored the countries list from DB -> files.

The DB countries table held the list of "default" countries: the countries shown by default, before a click on "all countries" loads the rest.

The default countries list is now in the config /config/site.*.php under: $site['defaultCountriesList'] = ['DE', 'NL', 'PL', 'RO', 'GB'];

I have added the same list for all the nodes for now, but I believe you should modify it according to your needs.

andrixnet commented 5 years ago

Thank you. I shall review it for RO and set up accordingly.

andrixnet commented 5 years ago

Question about data from geonames: will it be imported for offline local use, or accessed online via their web service?

andrixnet commented 5 years ago

Doing some tests at OCNA using -current code, I can see US states correctly even though the NUTS table is not filled properly. Should I presume this is through geonames.org?

Second, at OCRO we need layer level 3 data. That is: N 44° 51.211' E 24° 51.078'

Wide regions such as "Sud - Muntenia" make little sense to us. Romania is structured around the "judeţ", which is equivalent to the German "Land" or a US "state", though it is most often translated into English as "county". Romania's actual equivalent of a "county" in the US sense is the "comună", an administrative collection of villages and their surrounding areas around a principal locality. Briefly mentioned here as well: https://github.com/opencaching/opencaching-pl/issues/1838#issuecomment-455765448

Assuming geonames.org as data source, how can we set the level at OCRO?

kojoty commented 5 years ago

@andrixnet

Nope, let's forget about geonames. In the meantime I discovered that the EU has published new NUTS data (aka 2018), which includes, for example, country shapes for most of the world (previously NUTS data contained country shapes for the EU only).

I also discovered that geonames data quality varies...

Conclusion: for the EU + countries of the world, NUTS data are OK, so I will import the latest NUTS data. For OC NA we need another source of data in a similar format. For now we can use the data previously used by OCNA; I assume it is compatible with our current data and that the code should work with it without any significant changes.

andrixnet commented 5 years ago

@kojoty OK, let's forget about geonames. I have the following BIG unknown: on OCNA dev (to become production within a few days) I have filled the nuts_codes and nuts_layer tables with CA and MX data only, and only level 2 data (that is CA01, etc., like we currently use in Europe). Note: the data is also consistent as POLYGON and MULTIPOLYGON (so the ST_WITHIN MySQL/MariaDB function returns correct results without the need for additional PHP code to walk the boundary).

At this time there is no US data in nuts_codes and nuts_layer.

But if I try to publish a test cache, I get the following results:

Where do those US states come from in the drop-down?
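For readers unfamiliar with the "walk the boundary" check mentioned in the note above, here is a minimal sketch of a ray-casting point-in-polygon test. It is written in Python purely for illustration and is not the OCPL or MySQL implementation; the function names and the square "region" are hypothetical, and polygon holes are ignored.

```python
def point_in_ring(lon, lat, ring):
    """Ray casting: True if (lon, lat) lies inside the closed ring.

    `ring` is a list of (lon, lat) vertices; the last vertex is
    implicitly connected back to the first.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which this edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside  # each crossing to the east flips the state
    return inside


def point_in_multipolygon(lon, lat, rings):
    """True if the point is inside any outer ring (holes are not handled)."""
    return any(point_in_ring(lon, lat, ring) for ring in rings)


# Hypothetical square region, used only to exercise the functions.
square = [(24.0, 44.0), (25.0, 44.0), (25.0, 45.0), (24.0, 45.0)]
inside_result = point_in_multipolygon(24.85, 44.85, [square])
outside_result = point_in_multipolygon(26.0, 44.85, [square])
```

When the shapes are stored as valid POLYGON/MULTIPOLYGON geometries, the database function does this work, which is why no such application-side code is needed.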

rapotek commented 5 years ago

@kojoty Where did you get NUTS region shapes for the whole world? The ec.europa.eu site contains only European countries and regions (I checked the csv, LB and RG files).

When moving to NUTS 2018, the Polish translation "województwo" for a level 2 region is no longer valid: Poland has 16 regions with "województwo" status, but NUTS 2018 has 17 level 2 regions, with "Warszawski stołeczny" and "Mazowiecki regionalny" replacing the single region that corresponded to the existing (NUTS 2013) "Mazowieckie" region. The codes for other regions have changed as well, e.g. "Łódzkie" moved from PL11 in NUTS 2013 to PL71 in NUTS 2018. What about existing cache_location entries and dependent services like titled cache?

And finally: geonames.org uses ISO 3166-2. While the data quality of the service itself may be insufficient, ISO 3166-2 is a step in the right direction in my opinion, as it really is meant to cover the whole world.
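To make the cache_location concern above concrete, here is a minimal sketch of how stored level 2 codes might be remapped from NUTS 2013 to NUTS 2018. It is in Python for illustration only and is not existing OCPL code. The only mapping taken from this thread is PL11 -> PL71; the table names, the function name, and the use of PL12 as the NUTS 2013 code for "Mazowieckie" are assumptions.

```python
# One-to-one code changes. Only PL11 -> PL71 ("Łódzkie") comes from the thread;
# a real migration would need the full official correspondence table.
RENAMED = {"PL11": "PL71"}

# Regions that were split cannot be remapped by code alone.
# PL12 is assumed here to be the NUTS 2013 code for "Mazowieckie".
SPLIT = {"PL12"}


def remap_level2(code_2013):
    """Return the NUTS 2018 code, or None if the region was split.

    None means the entry has to be re-derived from the cache coordinates
    (for example with a point-in-polygon lookup against the new shapes).
    Codes not listed in either table are returned unchanged, which is only
    safe once RENAMED and SPLIT are complete.
    """
    if code_2013 in SPLIT:
        return None
    return RENAMED.get(code_2013, code_2013)


renamed = remap_level2("PL11")
split = remap_level2("PL12")
untouched = remap_level2("CA01")
```

The point of the sketch is that renamed regions can be fixed with a simple lookup, while split regions such as "Mazowieckie" need the coordinates of each cache to decide which new region applies.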