opencaching / opencaching-pl

The source code of Opencaching.PL (and some other domains)
https://opencaching.pl/
GNU General Public License v3.0
22 stars 33 forks source link

Regions Level-3 #103

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
There are 55 regions in the nuts_codes table that match code like 'RO%'.
There are 55 regions in the nuts_layers table that match code like 'RO%'.
Joining the tables using code produces 55 results, so the correspondence is 
correct. 

However only 8 regions are displayed in the dropdown box. 
Most important problem: none of the counties are in it.
Pressing zone from coordinates yields empty results. 

See already published caches. They should all match Argeş county, that is 
region RO311. None do so. 

Original issue reported on code.google.com by andrixnet on 9 Apr 2014 at 7:56

GoogleCodeExporter commented 9 years ago
regions displayed must follow European Union level NUTS-2, and
Romania (following
http://en.wikipedia.org/wiki/Administrative_divisions_of_RomaniaRomania)
has 8 regions NUTS-2, so actually it works as designed.

2014-04-09 9:56 GMT+02:00 <opencaching-pl@googlecode.com>:

Original comment by wloczynutka on 9 Apr 2014 at 9:39

GoogleCodeExporter commented 9 years ago
can it be made to work at NUTS_3 level? 
See here: http://en.wikipedia.org/wiki/Counties_of_Romania

For romanians, the NUTS_2 regions have no meaning and users will have much 
problem relating to them, it being a mandatory field for a cache. 

And there is already political controversy in our government on redefining 
administrative organization of the country. 

Original comment by andrixnet on 9 Apr 2014 at 9:47

GoogleCodeExporter commented 9 years ago
It can be made, but it must be done only for selected countries (because
regions of PL, B, L and NL works perfect witch nuts_2 layer)
Make it configurable seems very complex and time consuming work, at the
moment We have not spare programmers to perform that complex code change.

So I suggest close this issue as unreal to do.

2014-04-09 11:47 GMT+02:00 <opencaching-pl@googlecode.com>:

Original comment by wloczynutka on 9 Apr 2014 at 10:21

GoogleCodeExporter commented 9 years ago
I've changed priority to low, type to enhancement. 
It is a must for Romania, for the reasons explained above. 

I understand the resources required and I and a friend will also look into it 
some time later. 
Thank you. 

Original comment by andrixnet on 9 Apr 2014 at 3:43

GoogleCodeExporter commented 9 years ago
When you start work on it, to keep integrity I suggest keep current regions 
level-1 (country) and level-2 (districts) and add third selector for level-3 
regions.

as I said this is complex task.

requirments: 
================================================================================
==
- changes must apply on every page witch region selecting: especially newcache, 
editcache, search pages and all cache display pages (viewcache, main page, 
newest cache page, myneighbourd pages and so on.)
- level-3 should be active only for countries specified in config.
- level-3 have to be loaded on the fly (without page reload, by ajax 
technology) on events:
a) region-2 is selected by user 
b) coordinates are loaded from gpx (newcache)  
c) on user entered coordinates from keyboard.
on event 2 and 3 autaomatic select proper region from given coordinates.(using 
the same script as for level-2, maybe need add few lines of code) 

Original comment by wloczynutka on 9 Apr 2014 at 7:35

GoogleCodeExporter commented 9 years ago
Understood. It is indeed a complex task. 
I will set up a copy of my node as development where I will perform the more 
complicated programming. 

Thank you for the details, they will be most helpful. 

Original comment by andrixnet on 9 Apr 2014 at 8:44

GoogleCodeExporter commented 9 years ago
Question: isn't region supposed to be filled automatically from coordinates: 

This is what I get for any cache:

 Region from coordinates:
N 44° 52.096'  E 24° 49.158'
Country

Region found by Opencaching Romania

Region found by Google Maps

=====================
no results when pressing "region from coordinates"

Original comment by andrixnet on 10 Apr 2014 at 8:47

GoogleCodeExporter commented 9 years ago
this button only opens the window.
Automatic filling country/region is done other way (via ajax) and on events 
(when user fill coords or load it from gpx), 
but this is not implemented on editcache I think, but works on newcache.

Original comment by wloczynutka on 10 Apr 2014 at 9:21

GoogleCodeExporter commented 9 years ago
My non-working testcase was from newcache.

Original comment by andrixnet on 10 Apr 2014 at 10:36

GoogleCodeExporter commented 9 years ago
I meant: 
- button only displays small window, do not taka any action on both edit/new 
cache page
- automatic fill region data is done by ajax scripts, only on new cache page.

Original comment by wloczynutka on 11 Apr 2014 at 9:22

GoogleCodeExporter commented 9 years ago
I'd like to ask about this issue status: 7 months with no action. I suggest to 
close this issue as abandoned.

Original comment by wloczynutka on 5 Nov 2014 at 10:20

GoogleCodeExporter commented 9 years ago
I had little progress on this, so far, but not significant. 
There were/are other issues with higher importance.

However this is very significant for our country and should remain open until 
solved.
Thank you.

Original comment by andrixnet on 6 Nov 2014 at 8:40

GoogleCodeExporter commented 9 years ago
Note: in table cache_location only adm1 and adm3 are filled (at this time). 
All adm1, adm2, adm3, adm4 should be filled to correctly acomplish this 
feature. 

Further investigation is necessary on how to correlate existing data with 
proper fields.

Original comment by andrixnet on 7 Nov 2014 at 10:26

kojoty commented 7 years ago

@andrixnet, please review this and if necessary add summarize what should be done here and reopen then. Thanks.

andrixnet commented 4 years ago

Most OC sites use NUTS layer 3 data, as this is relevant for their country's context. However relevant region data for Romania is equivalent to data in NUTS layer 4.

andrixnet commented 4 years ago

According to my research: Countries are at level 0 (numbering in table) macro regions are at level 1 provinces are at level 2 (most EU sites, CA, US, MX) counties are at level 3 (required by RO since romanian "judeţ" is at this level)

Current code recognizes provinces at level 2 only (tested with new data for US).

kojoty commented 4 years ago

@andrixnet Please answer my question in context of OCUS:

Do OCUS need all 3 levels (country+state+county) or only set of 2 from those three... Or maybe we should merge states and counties and publish something like: US - CA /San Francisco (county)

kojoty commented 4 years ago

@andrixnet , @deg-pl, @harrieklomp FYI as well

my next update to this issue - draft of my idea:

When all will be ready (hope after the weekend) we will be able to migrate all nodes to this.

andrixnet commented 4 years ago

Since our code only presents the user only 2 informations: country and region, what you describe above doesn't seem to simplify it.

Existing data already has global unique code for country and region (such as US = USA (country) and US12=Kansas (state/province). Or PL = Poland and PL12=Lubuskie.

Haven't seen (so far) any interest in macro regions such as "south-west", so that level of data would it really be necessary to have and process?

_Also, haven't seen any interest at county level (where much more geo data is required). (example: Słubice County of Lubusz Voivodeship)_

The same goes for San Francisco County of California ... (from above example CA/San Francisco)

The equivalent for RO according to existing data: RO = Romania and RO123 = Arges. I know the english terminology is "county", but this is the same as polish Voivodeship, german Land and US state. At RO12 we have "South" for example, which is of zero relevance.

(region codes are not actual, just examples).


Thus I see two ways of handling this:

  1. keep geo data, change PHP code:

    • keep geo data and region codes as they are.
    • PHP code handling countries probably does not need changes
    • PHP code handling regions needs to:
    • make use of all 4 levels of data (if exists) as it is now (though some will be useless)
    • make the level configurable (for each site)
  2. process geo data, change PHP code:

    • geo data to contain, for each country, only 2 levels: country outline and province/state outline.
    • country outline code is already 2 characters long, unique, based on ISO
    • province/state outline may be variable in length (depending on country), but at least 3 characters long. (usually 4 or 5)
    • level column is no longer needed (just use code length as key)
    • adm/code 1-4 columns are no longer needed (and the PHP code handling them)
    • caches table can have 2 columns to reference country and province/state

Both scenarios may also handle US ZIP codes using :

  1. a dedicated table with shape data
  2. integrate shape data into country/region table and use a separate column with ZIP code (country/region data having this column as NULL)
rapotek commented 4 years ago

Long ago I recommended using iso3166 -1 and -2 codes (even made some tests in the subject). They do not include US counties, but may be sufficient for the rest of the world and counties may be imported otherwise. As for level in proposed table: why not base on OSM data, and use its "admin_level" attribute associated with border shapes? The attribute exists regardless of iso, nuts, fips or ansi values.

kojoty commented 4 years ago

@andrixnet I think I was not explained it well and we don't understand each other :)

My general opinion is that we need 3 levels: L0=country + L1=state + L2=local-region

For most countries like Pl or RO L1 is not necessary - for PL: L0=PL + PL12=Lubuskie is enough

But for US where Teksas is probably bigger than Poland and Romania together would be good to still have third level like: L0=US (country) L1=USTX (state) L2=Harries (county) - this is the only reason why I think about the third level (L1) - without this we can just use two levels: country + region.

For most countries we can skip L1 and have just only L0+L2.

I see two cases when we need these data:

We can chat over IM or call to find the right vision :)

andrixnet commented 4 years ago

AFAIK in USA states are very important and ZIP code is more important then county. But let's take it up with them.

kojoty commented 4 years ago

Long ago I recommended using iso3166 -1 and -2 codes (even made some tests in the subject). They do not include US counties, but may be sufficient for the rest of the world and counties may be imported otherwise. As for level in proposed table: why not base on OSM data, and use its "admin_level" attribute associated with border shapes? The attribute exists regardless of iso, nuts, fips or ansi values.

@rapotek, please point me to the source of OSM data - maybe it would be a good idea...

ISO code-s is not a problem in my opinion - For countries we used ISO codes now (two letters version). For regions we use the codes internal only so this is not the case - if we use gov. data (I've found a good source of shapefiles for EU, US, CA, MX for now) it much easy to rely on codes from that files.

kojoty commented 4 years ago

@andrixnet ZIP-codes is a separate case we can add it later - I think it should be stored in different table etc.

BTW. in my opinion this is a little strange case of usage: do you really need to search geocaches by post-code? Because of why?

andrixnet commented 4 years ago

@kojoty if we include county data or macro region data, the next step will be searching by that criteria (I'm not saying it, someone will definitely ask for it though).

Regarding ZIP codes:

rapotek commented 4 years ago

@kojoty just check https://overpass-turbo.eu/ for live example. Sample queries:

The interesting results should be in 'Data' tab. Change tags to geomand you will have a shape data, matching exactly boundaries marked on openstreetmap.

The same queries work in real overpass, which should be used by software: http://overpass-api.de/api/interpreter

kojoty commented 4 years ago

@rapotek but we definitely need offline data: we need to be able to do for example:

If we relay on external api and store the results for each geocache we can't be sure that we have consistent data (because external api data can be changed in time...)

...,but if you know how to get the offline copy of OSM data with shapefiles - it would be useful for us as a source of geo-data for table.

kojoty commented 4 years ago

@kojoty if we include county data or macro region data, the next step will be searching by that criteria (I'm not saying it, someone will definitely ask for it though).

Regarding ZIP codes:

OK, we can support ZIP codes as well - for me this is not a problem - we need only the source of zipcode data - and let's do in in next step - not now.

andrixnet commented 4 years ago

I agree on the point that we need our own offline copy of the data, for an open-source, volunteer driven project to rely on third party on-line service providers is not feasible (in the long run).

@kojoty let's follow up with ZIP codes here: https://github.com/opencaching/opencaching-pl/issues/865

rapotek commented 4 years ago

@kojoty As I wrote you some time ago, the whole set of iso3166-2 shapes available in OSM weights about 500 MB. I have a sample software to get it all, but it is a bit outdated now and will not work correctly. For example: it bases on old iso3166-2 codes retrieved from a python library, whereas in OSM there are new codes for Poland, being in effect since 2018/11/26.

The idea was to use two tables:

CREATE` TABLE iso3166_codes(
    code VARCHAR(6) NOT NULL PRIMARY KEY COMMENT 'ISO 3166-1 alpha-2 code or ISO 3166-2 code with hyphen',
    continent VARCHAR(2) COMMENT 'Two letter continent code, may be NULL',
    country VARCHAR(2) NOT NULL COMMENT 'ISO 3166-1 alpha-2 country code',
    subdivision VARCHAR(3) COMMENT 'ISO 3166-2 administrative subdivision code part, may be NULL in case of full country',
    parent_code VARCHAR(6) COMMENT 'ISO 3166-2 full code of subdivision parent',
    last_modified DATETIME NOT NULL COMMENT 'Last modification date and time',
    INDEX(country)
) ENGINE InnoDB;

CREATE TABLE iso3166_shapes(
    code VARCHAR(6) NOT NULL PRIMARY KEY COMMENT 'ISO 3166-1 alpha-2 code or ISO 3166-2 code with hyphen',
    shape GEOMETRY NOT NULL COMMENT 'Geometry of country or subdivision boundaries',
    min_lon FLOAT(10, 7) COMMENT 'Minimal longitide of code area shape',
    min_lat FLOAT(9, 7) COMMENT 'Minimal latitude of code area shape',
    max_lon FLOAT(10, 7) COMMENT 'Maximal longitide of code area shape',
    max_lat FLOAT(9, 7) COMMENT 'Maximal latitude of code area shape',
    last_modified DATETIME NOT NULL COMMENT 'Last modification date and time',
    SPATIAL INDEX(shape),
    FOREIGN KEY(code) REFERENCES iso3166_codes(code) ON UPDATE CASCADE ON DELETE CASCADE
) ENGINE InnoDB;

where continent is for future use, min_lon, min_lat, max_lon, max_lat are to significantly improve time while searching by given point, parent_code is for example for UK, where iso3166-2 codes are multi-leveled.

andrixnet commented 4 years ago

From which to actually use only the countries and subdivisions needed. 500MB as OSM data? (might be different in SQL)

rapotek commented 4 years ago

@andrixnet a folder with SQLs ready to import: 556 MB, a mysql db folder after import: 434 MB. But it might be different for up-to-date data.

kojoty commented 4 years ago

if you know where i can download this OSM data I would like to take a look closer on this - maybe we can just filter out some of the data from that - but I need to review it

rapotek commented 4 years ago

@kojoty Up until now I cannot locate original sql files with downloaded data (backuped it somewhere and forgot), but I will send you an email with link to gzipped mysqldump of above tables. Remember, codes and shapes were valid on 2018/11/11, where on 4835 iso3166-2 codes only 4206 shapes were available to download.

kojoty commented 4 years ago

@kojoty Up until now I cannot locate original sql files with downloaded data (backuped it somewhere and forgot), but I will send you an email with link to gzipped mysqldump of above tables...

cool - let me review this - thanks.