CartoDB / data-services

CARTO internal geocoder PostgreSQL extension

evaluate openmundi for admin0 synonyms #62

Closed andrewxhill closed 10 years ago

andrewxhill commented 10 years ago

https://github.com/openmundi/world.csv

iriberri commented 10 years ago

I've created this table, which contains the CSV file with the iso2 codes and names of the countries: https://geocoding.cartodb.com/tables/testing_countries/table

Right now the result is:

Using names

216 out of 249 rows were successfully turned into polygons!

Failed:

Western Sahara
Congo DR
Cook Islands
Bonaire (Sint Eustatius and Saba)
Bouvet Island
British Indian Ocean Territory
Brunei Darussalam
Christmas Island
Cocos (Keeling) Islands
Guadeloupe
Faroe Islands
French Guiana
French Southern Territories
Heard Island and McDonald Islands
Saint Martin (French)
Marshall Islands
Martinique
Northern Mariana Islands
North Korea
Pitcairn
Mayotte
South Korea
Svalbard and Jan Mayen
Virgin Islands (British)
Virgin Islands (US)
Wallis and Futuna
Réunion
Sint Maarten (Dutch)
Solomon Islands
South Georgia and the South Sandwich Islands
United States Minor Outlying Islands
Vatican City
Tokelau

Using ISO codes

237 out of 249 rows were successfully turned into polygons!

Failed:

Cocos (Keeling) Islands
Christmas Island
Bouvet Island
Guadeloupe
French Guiana
Norway
Mayotte
Réunion
Tokelau
Svalbard and Jan Mayen
Bonaire (Sint Eustatius and Saba)
Martinique

iriberri commented 10 years ago

@andrewxhill As I understand it, I must make sure that every name that appears in the first block but not in the second one is available through the synonyms table. Those are the cases where we have the geometry (it is found by iso2) but we don't recognize the country by that name.

I've also noticed that Norway appears in the second list. I'll check the issue with the iso2 code for this country.
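The comparison described above is a set difference between the two failure lists. Here is a minimal Python sketch of it, assuming both lists use exactly the name strings from the same source table (true here, since both runs used `testing_countries`). The variable names are illustrative and not part of the geocoder.

```python
# Names that failed when geocoding by country name (first list above).
failed_by_name = {
    "Western Sahara", "Congo DR", "Cook Islands",
    "Bonaire (Sint Eustatius and Saba)", "Bouvet Island",
    "British Indian Ocean Territory", "Brunei Darussalam",
    "Christmas Island", "Cocos (Keeling) Islands", "Guadeloupe",
    "Faroe Islands", "French Guiana", "French Southern Territories",
    "Heard Island and McDonald Islands", "Saint Martin (French)",
    "Marshall Islands", "Martinique", "Northern Mariana Islands",
    "North Korea", "Pitcairn", "Mayotte", "South Korea",
    "Svalbard and Jan Mayen", "Virgin Islands (British)",
    "Virgin Islands (US)", "Wallis and Futuna", "Réunion",
    "Sint Maarten (Dutch)", "Solomon Islands",
    "South Georgia and the South Sandwich Islands",
    "United States Minor Outlying Islands", "Vatican City", "Tokelau",
}

# Names whose iso2 code failed to geocode (second list above).
failed_by_iso2 = {
    "Cocos (Keeling) Islands", "Christmas Island", "Bouvet Island",
    "Guadeloupe", "French Guiana", "Norway", "Mayotte", "Réunion",
    "Tokelau", "Svalbard and Jan Mayen",
    "Bonaire (Sint Eustatius and Saba)", "Martinique",
}

# Found by iso2 but not by name: the geometry exists, so a synonym is enough.
needs_synonym = failed_by_name - failed_by_iso2

# Found by name but not by iso2: points at a problem with the iso2 code.
iso2_only = failed_by_iso2 - failed_by_name

# Failed both ways: the geometry itself is probably missing.
missing_geometry = failed_by_name & failed_by_iso2

for label, names in [
    ("Needs synonym", needs_synonym),
    ("Fails by iso2 only", iso2_only),
    ("Fails both ways", missing_geometry),
]:
    print(f"{label} ({len(names)}):")
    for name in sorted(names):
        print(f"  {name}")
```

The first group is the synonym work, the second is the Norway case, and the third needs geometry before any synonym will help.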

Related: https://github.com/CartoDB/data-services/issues/41

andrewxhill commented 10 years ago

Yeah, because screw Norway!

Nah, jk. Good stuff. Let's try to pick off the obvious ones; the more complicated ones we can push lower in priority until the other stuff is complete.

iriberri commented 10 years ago

I have run the geocoding process again after the addition of synonyms:

228 out of 249 (by name)

Missing:

Bouvet Island
Cocos (Keeling) Islands
**Guadeloupe**
**French Guiana**
Christmas Island
**Martinique**
Pitcairn
Mayotte
**Virgin Islands (US)**
**Réunion**
Sint Maarten (Dutch)
**Vatican City**
South Georgia and the South Sandwich Islands
Virgin Islands (British)
Saint Martin (French)
Tokelau
Svalbard and Jan Mayen
French Southern Territories
Brunei Darussalam
Bonaire (Sint Eustatius and Saba)
**Congo DR**

I'll manually check if we already have some of these regions and add manual synonyms if needed.

iriberri commented 10 years ago

Solved for some of them:

238 out of 249 rows were successfully turned into polygons!

Added:

I'm seeing that Réunion and French Guiana do not appear in ne_admin0_v3, although the script for creating them is available. Should I run it, @andrewxhill?

Geocoding by iso2 code instead, we get one fewer:

237 out of 249 rows were successfully turned into polygons!

12 adm0 regions left!!! :-)

Missing from iso2 code:

Cocos (Keeling) Islands
Bouvet Island
Guadeloupe
French Guiana
Christmas Island
--- Norway
Martinique
Mayotte
Réunion
Svalbard and Jan Mayen
Bonaire (Sint Eustatius and Saba)
Tokelau

Missing from name:

Bouvet Island
Cocos (Keeling) Islands
French Guiana
Christmas Island
Martinique
Mayotte
Réunion
Svalbard and Jan Mayen
Guadeloupe
Bonaire (Sint Eustatius and Saba)
Tokelau

iriberri commented 10 years ago

Closing this and following the thread here: https://github.com/CartoDB/data-services/issues/80