nvkelso / natural-earth-vector

A global, public domain map dataset available at three scales and featuring tightly integrated vector and raster data.
https://www.naturalearthdata.com/

README of the table and field descriptions? #153

Open ppKrauss opened 8 years ago

ppKrauss commented 8 years ago

The README of 10m-admin-0-countries is this HTML... Where are the complete data specifications?

Is there no table/field description? For example: what is the difference between postal and iso_a2? When must they be equal, and when not? Another example is "the NULL is -99": where is that specified?

| Column | Type | DESCRIPTION?? |
|---|---|---|
| gid | integer | |
| scalerank | smallint | |
| featurecla | character varying(30) | |
| labelrank | double precision | |
| sovereignt | character varying(32) | |
| sov_a3 | character varying(3) | |
| adm0_dif | double precision | |
| level | double precision | |
| type | character varying(17) | |
| admin | character varying(40) | |
| adm0_a3 | character varying(3) | |
| geou_dif | double precision | |
| geounit | character varying(40) | |
| gu_a3 | character varying(3) | |
| su_dif | double precision | |
| subunit | character varying(40) | |
| su_a3 | character varying(3) | |
| brk_diff | double precision | |
| name | character varying(36) | |
| name_long | character varying(40) | |
| brk_a3 | character varying(3) | |
| brk_name | character varying(36) | |
| brk_group | character varying(30) | |
| abbrev | character varying(13) | |
| postal | character varying(4) | |
| formal_en | character varying(52) | |
| ... | ... | ?? |

PS: other "not explained" cases that I would like to understand, by peakbagger as suggested.

SELECT iso_a2, iso_n3, postal 
FROM ne10m_units WHERE iso_a2='-99' AND length(postal)=2;

SELECT iso_a2, iso_n3, postal 
FROM ne10m_units WHERE iso_a2='-99' AND iso_n3::int>0;

SELECT iso_a2, iso_n3, postal 
FROM ne10m_countries WHERE iso_a2='-99' AND length(postal)=2;

brunob commented 8 years ago

Some info is available in this topic: http://www.naturalearthdata.com/forums/topic/thematic-codes/

A complete list would be very useful :)

ppKrauss commented 8 years ago

Thanks @brunob for the feedback (!).

Hmm... The problem is not only whether the information exists (in some hidden corner of the universe), but the cost of finding it... Today the best practice is to publish data and metadata together, in a standard way. There are two good open standards:

PS: for a quick fix... Your link does not show the "DESCRIPTION??" column; can you fill in the descriptions here?

nvkelso commented 8 years ago

An example of abbrev versus postal is:

name: California
abbrev: Calif.
postal: CA
iso_a2: US (which is the country code, not the region code)

nvkelso commented 8 years ago

One of the stated goals of Natural Earth is not to get bogged down in mind numbing XML metadata, but I hear that these field names aren't making sense to you. Keep the questions coming and I'll answer them, though.

brunob commented 8 years ago

Maybe a simple CSV file describing the fields, included in each zip, could do the job, or a web page (on the repo wiki or on the official website) linked from every README file?

Anyway, @nvkelso no hurry, and thanks a lot for these super useful datasets :)

ppKrauss commented 8 years ago

@brunob yes, a JSON or a CSV file with descriptions would be perfect!

@nvkelso Some obvious names such as "abbrev" and "name" make sense, but many others, such as "geou_dif" and "su_a3", make no sense... we need some "translation" ;-)

About goals: hmm... this is another discussion, and perhaps more a matter of ideology and personal point of view... I will try to explain.

Today, Open Data and "not to get bogged down in mind numbing XML metadata" are not compatible. Low corruption and transparency need clarity and good semantics (metadata); the basic standards are the minimum for clarity.

Natural Earth is good and built by good people (!), but corruption hides in the lack of explanation... Today, in some "more serious" uses, we are blocked from using Natural Earth because there are no good explanations.

nvkelso commented 8 years ago

Primarily the confusing field names are optional / rarely used. It's a once or twice a year question over millions of downloads, shrug. Send a PR if you feel super strongly about it.

geou_dif = Geo unit is different from its parent administrative unit, true/false. This is helpful for labeling maps, and for not repeating labels.

su_a3 = Natural Earth 3-character alpha code for the administrative sub-unit; like ISO A3 codes, but different.
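
As a minimal sketch of the CSV data dictionary suggested earlier in the thread, the two definitions above could be recorded and looked up as follows. The column layout (`field`, `description`) and the helper names are assumptions for illustration, not an existing Natural Earth file or API; the first field is spelled `geou_dif`, as in the column list at the top of the thread.

```python
import csv
import io

# Hypothetical data dictionary; only the two fields explained above are filled in.
FIELD_CSV = """field,description
geou_dif,"Geo unit is different from its parent administrative unit (true/false); helps avoid repeated map labels"
su_a3,"Natural Earth 3-character code for the administrative sub-unit; like ISO A3 codes, but different"
"""


def load_field_descriptions(text):
    """Parse the CSV text into a {field_name: description} dict with lowercase keys."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["field"].lower(): row["description"] for row in reader}


def describe(field, descriptions):
    """Look up a field case-insensitively (the thread shows both su_a3 and SU_A3)."""
    return descriptions.get(field.lower(), "(no description yet)")


descriptions = load_field_descriptions(FIELD_CSV)
print(describe("SU_A3", descriptions))
print(describe("brk_a3", descriptions))
```

Fields nobody has explained yet simply fall through to the placeholder, which makes the remaining gaps easy to list.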

nvkelso commented 7 years ago

From @jalbertbowden on Apr 10, 2015:

See:

I've pieced together a list of links with some info, and I think I labeled a few correctly for the 110m cultural boundaries, but a) there are a lot of gaps and b) I'm not 100% sure about the ones I have. If you can fill in the gaps, or add to the list, that would be sweet. I'm trying to format the data, but it's hard without the definitions.
https://gist.github.com/jalbertbowden/1a94aa339682eabfdc6a

pronebird commented 5 years ago

Was there any update on this? It's mind-boggling that the schema is not annotated. I don't understand what the name_zh field is: is it Traditional Chinese or Simplified?

ImreSamu commented 5 years ago

@pronebird

I don't understand what name_zh field is, is it traditional chinese or simplified?

name_zh is a name taken from the Wikidata label; the suffix (_zh) is the Wikidata language code: https://www.wikidata.org/wiki/Help:Wikimedia_language_codes/lists/all

If you need Simplified Chinese, you have to import/merge the Wikidata zh-hans labels manually (or, for Traditional Chinese, the Wikidata zh-hant labels).
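
A minimal sketch of such a manual merge, assuming the feature carries a Wikidata ID (the Italy example elsewhere in this thread has `WIKIDATAID: Q38`) and using the public Wikidata `wbgetentities` API action. The helper names are hypothetical, and the response below is a hand-written sample in the shape that API returns, so nothing here touches the network.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def labels_url(wikidata_id, languages):
    """Build a wbgetentities request for the labels of one entity."""
    query = urlencode({
        "action": "wbgetentities",
        "ids": wikidata_id,
        "props": "labels",
        "languages": "|".join(languages),
        "format": "json",
    })
    return f"{WIKIDATA_API}?{query}"


def extract_label(response, wikidata_id, language):
    """Pull one label out of a decoded wbgetentities response, or None if absent."""
    labels = response["entities"][wikidata_id].get("labels", {})
    entry = labels.get(language)
    return entry["value"] if entry else None


# Hand-written sample response (not fetched), shaped like the API output.
sample = {"entities": {"Q38": {"labels": {
    "zh-hant": {"language": "zh-hant", "value": "義大利"}}}}}

print(labels_url("Q38", ["zh-hans", "zh-hant"]))
print(extract_label(sample, "Q38", "zh-hant"))
```

The extracted value can then be written to a new property of your own choosing next to the existing name_zh.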

If you are interested in the technical details:

pronebird commented 5 years ago

@ImreSamu thanks for the pointer. Is there any reason why Traditional Chinese is not included in the data dump by default?

ImreSamu commented 5 years ago

Is there any reason why traditional chinese is not included in the data dump by default?

As far as I know:

NOW: if you need any other language translation (like "Traditional Chinese"), you can add it yourself, with minimal scripting knowledge:

boris-glumpler commented 2 years ago

Just have a quick question about gid. Can I assume that these values stay the same in between updates and could therefore be used as foreign keys? Cheers!

nvkelso commented 2 years ago

No, use ne_id

boris-glumpler commented 2 years ago

No, use ne_id

Thanks!
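
A short sketch of what that looks like in practice: index features by `NE_ID`, the key the maintainer recommends above, rather than by `gid` or list position. The `index_by_ne_id` helper and the second feature are made up for illustration; 1159320919 is the `NE_ID` of Italy in the GeoJSON example further down the thread.

```python
def index_by_ne_id(features):
    """Map NE_ID -> feature, accepting either upper- or lower-case property keys."""
    index = {}
    for feature in features:
        props = feature["properties"]
        ne_id = props.get("NE_ID", props.get("ne_id"))
        if ne_id is None:
            raise KeyError("feature has no NE_ID property")
        if ne_id in index:
            raise ValueError(f"duplicate NE_ID: {ne_id}")
        index[ne_id] = feature
    return index


features = [
    {"type": "Feature", "properties": {"NE_ID": 1159320919, "NAME": "Italy"}},
    # Hypothetical second feature with a made-up ID.
    {"type": "Feature", "properties": {"ne_id": 42, "name": "Example"}},
]

by_id = index_by_ne_id(features)
print(by_id[1159320919]["properties"]["NAME"])
```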

vincerubinetti commented 1 year ago

Was about to create a new issue with the content below until I found this one.

One of the stated goals of Natural Earth is not to get bogged down in mind numbing XML metadata

Isn't that... kind of all of it though? The geometry data is pretty mind numbing and tedious too, isn't it? Why draw the line here, when so much data is there and so much work is already done?

Don't get me wrong, this whole project is a monumental task that I would never want to do myself, but it seems silly to not even include a best effort attempt at annotating the properties, or even just put a link to this discussion prominently in the repo or on the website. It took me forever to find anything about this. And as another commenter said, the lack of clarity makes this info (and indeed maybe the whole dataset, to some) not prudent to use.

The issue I was going to post, for extra keywords for other peoples' searches:


Is there documentation or explanation of all the different fields that are in a Feature's properties? There's a lot of different but similar looking values, and their keys are usually short and ambiguous. As a geography/GIS novice, I have no idea which ones I should use, what the variability is, what the source actually is, or even what the full name of the property is (e.g. I assume BRK_NAME is an abbreviation but what's the full name).

example geojson for Italy

```json
{ "type": "Feature", "properties": { "featurecla": "Admin-0 country", "scalerank": 1, "LABELRANK": 2, "SOVEREIGNT": "Italy", "SOV_A3": "ITA", "ADM0_DIF": 0, "LEVEL": 2, "TYPE": "Sovereign country", "TLC": "1", "ADMIN": "Italy", "ADM0_A3": "ITA", "GEOU_DIF": 0, "GEOUNIT": "Italy", "GU_A3": "ITA", "SU_DIF": 0, "SUBUNIT": "Italy", "SU_A3": "ITA", "BRK_DIFF": 0, "NAME": "Italy", "NAME_LONG": "Italy", "BRK_A3": "ITA", "BRK_NAME": "Italy", "BRK_GROUP": null, "ABBREV": "Italy", "POSTAL": "I", "FORMAL_EN": "Italian Republic", "FORMAL_FR": null, "NAME_CIAWF": "Italy", "NOTE_ADM0": null, "NOTE_BRK": null, "NAME_SORT": "Italy", "NAME_ALT": null, "MAPCOLOR7": 6, "MAPCOLOR8": 7, "MAPCOLOR9": 8, "MAPCOLOR13": 7, "POP_EST": 60297396, "POP_RANK": 16, "POP_YEAR": 2019, "GDP_MD": 2003576, "GDP_YEAR": 2019, "ECONOMY": "1. Developed region: G7", "INCOME_GRP": "1. High income: OECD", "FIPS_10": "IT", "ISO_A2": "IT", "ISO_A2_EH": "IT", "ISO_A3": "ITA", "ISO_A3_EH": "ITA", "ISO_N3": "380", "ISO_N3_EH": "380", "UN_A3": "380", "WB_A2": "IT", "WB_A3": "ITA", "WOE_ID": 23424853, "WOE_ID_EH": 23424853, "WOE_NOTE": "Exact WOE match as country", "ADM0_ISO": "ITA", "ADM0_DIFF": null, "ADM0_TLC": "ITA", "ADM0_A3_US": "ITA", "ADM0_A3_FR": "ITA", "ADM0_A3_RU": "ITA", "ADM0_A3_ES": "ITA", "ADM0_A3_CN": "ITA", "ADM0_A3_TW": "ITA", "ADM0_A3_IN": "ITA", "ADM0_A3_NP": "ITA", "ADM0_A3_PK": "ITA", "ADM0_A3_DE": "ITA", "ADM0_A3_GB": "ITA", "ADM0_A3_BR": "ITA", "ADM0_A3_IL": "ITA", "ADM0_A3_PS": "ITA", "ADM0_A3_SA": "ITA", "ADM0_A3_EG": "ITA", "ADM0_A3_MA": "ITA", "ADM0_A3_PT": "ITA", "ADM0_A3_AR": "ITA", "ADM0_A3_JP": "ITA", "ADM0_A3_KO": "ITA", "ADM0_A3_VN": "ITA", "ADM0_A3_TR": "ITA", "ADM0_A3_ID": "ITA", "ADM0_A3_PL": "ITA", "ADM0_A3_GR": "ITA", "ADM0_A3_IT": "ITA", "ADM0_A3_NL": "ITA", "ADM0_A3_SE": "ITA", "ADM0_A3_BD": "ITA", "ADM0_A3_UA": "ITA", "ADM0_A3_UN": -99, "ADM0_A3_WB": -99, "CONTINENT": "Europe", "REGION_UN": "Europe", "SUBREGION": "Southern Europe",
"REGION_WB": "Europe & Central Asia", "NAME_LEN": 5, "LONG_LEN": 5, "ABBREV_LEN": 5, "TINY": -99, "HOMEPART": 1, "MIN_ZOOM": 0, "MIN_LABEL": 2, "MAX_LABEL": 7, "LABEL_X": 11.076907, "LABEL_Y": 44.732482, "NE_ID": 1159320919, "WIKIDATAID": "Q38", "NAME_AR": "إيطاليا", "NAME_BN": "ইতালি", "NAME_DE": "Italien", "NAME_EN": "Italy", "NAME_ES": "Italia", "NAME_FA": "ایتالیا", "NAME_FR": "Italie", "NAME_EL": "Ιταλία", "NAME_HE": "איטליה", "NAME_HI": "इटली", "NAME_HU": "Olaszország", "NAME_ID": "Italia", "NAME_IT": "Italia", "NAME_JA": "イタリア", "NAME_KO": "이탈리아", "NAME_NL": "Italië", "NAME_PL": "Włochy", "NAME_PT": "Itália", "NAME_RU": "Италия", "NAME_SV": "Italien", "NAME_TR": "İtalya", "NAME_UK": "Італія", "NAME_UR": "اطالیہ", "NAME_VI": "Ý", "NAME_ZH": "意大利", "NAME_ZHT": "義大利", "FCLASS_ISO": "Admin-0 country", "TLC_DIFF": null, "FCLASS_TLC": "Admin-0 country", "FCLASS_US": null, "FCLASS_FR": null, "FCLASS_RU": null, "FCLASS_ES": null, "FCLASS_CN": null, "FCLASS_TW": null, "FCLASS_IN": null, "FCLASS_NP": null, "FCLASS_PK": null, "FCLASS_DE": null, "FCLASS_GB": null, "FCLASS_BR": null, "FCLASS_IL": null, "FCLASS_PS": null, "FCLASS_SA": null, "FCLASS_EG": null, "FCLASS_MA": null, "FCLASS_PT": null, "FCLASS_AR": null, "FCLASS_JP": null, "FCLASS_KO": null, "FCLASS_VN": null, "FCLASS_TR": null, "FCLASS_ID": null, "FCLASS_PL": null, "FCLASS_GR": null, "FCLASS_IT": null, "FCLASS_NL": null, "FCLASS_SE": null, "FCLASS_BD": null, "FCLASS_UA": null }, "bbox": [ 6.749955, 36.619987, 18.480247, 47.115393 ], "geometry": { "type": "MultiPolygon", "coordinates": [ [ [ [ 10.442701, 46.893546 ], [ 11.048556, 46.751359 ], [ 11.164828, 46.941579 ], [ 12.153088, 47.115393 ], [ 12.376485, 46.767559 ], [ 13.806475, 46.509306 ], [ 13.69811, 46.016778 ], [ 13.93763, 45.591016 ], [ 13.141606, 45.736692 ], [ 12.328581, 45.381778 ], [ 12.383875, 44.885374 ], [ 12.261453, 44.600482 ], [ 12.589237, 44.091366 ], [ 13.526906, 43.587727 ], [ 14.029821, 42.761008 ], [ 15.14257, 41.95514 ], [ 15.926191,
41.961315 ], [ 16.169897, 41.740295 ], [ 15.889346, 41.541082 ], [ 16.785002, 41.179606 ], [ 17.519169, 40.877143 ], [ 18.376687, 40.355625 ], [ 18.480247, 40.168866 ], [ 18.293385, 39.810774 ], [ 17.73838, 40.277671 ], [ 16.869596, 40.442235 ], [ 16.448743, 39.795401 ], [ 17.17149, 39.4247 ], [ 17.052841, 38.902871 ], [ 16.635088, 38.843572 ], [ 16.100961, 37.985899 ], [ 15.684087, 37.908849 ], [ 15.687963, 38.214593 ], [ 15.891981, 38.750942 ], [ 16.109332, 38.964547 ], [ 15.718814, 39.544072 ], [ 15.413613, 40.048357 ], [ 14.998496, 40.172949 ], [ 14.703268, 40.60455 ], [ 14.060672, 40.786348 ], [ 13.627985, 41.188287 ], [ 12.888082, 41.25309 ], [ 12.106683, 41.704535 ], [ 11.191906, 42.355425 ], [ 10.511948, 42.931463 ], [ 10.200029, 43.920007 ], [ 9.702488, 44.036279 ], [ 8.888946, 44.366336 ], [ 8.428561, 44.231228 ], [ 7.850767, 43.767148 ], [ 7.435185, 43.693845 ], [ 7.549596, 44.127901 ], [ 7.007562, 44.254767 ], [ 6.749955, 45.028518 ], [ 7.096652, 45.333099 ], [ 6.802355, 45.70858 ], [ 6.843593, 45.991147 ], [ 7.273851, 45.776948 ], [ 7.755992, 45.82449 ], [ 8.31663, 46.163642 ], [ 8.489952, 46.005151 ], [ 8.966306, 46.036932 ], [ 9.182882, 46.440215 ], [ 9.922837, 46.314899 ], [ 10.363378, 46.483571 ], [ 10.442701, 46.893546 ] ] ], [ [ [ 14.761249, 38.143874 ], [ 15.520376, 38.231155 ], [ 15.160243, 37.444046 ], [ 15.309898, 37.134219 ], [ 15.099988, 36.619987 ], [ 14.335229, 36.996631 ], [ 13.826733, 37.104531 ], [ 12.431004, 37.61295 ], [ 12.570944, 38.126381 ], [ 13.741156, 38.034966 ], [ 14.761249, 38.143874 ] ] ], [ [ [ 8.709991, 40.899984 ], [ 9.210012, 41.209991 ], [ 9.809975, 40.500009 ], [ 9.669519, 39.177376 ], [ 9.214818, 39.240473 ], [ 8.806936, 38.906618 ], [ 8.428302, 39.171847 ], [ 8.388253, 40.378311 ], [ 8.159998, 40.950007 ], [ 8.709991, 40.899984 ] ] ] ] } }
```

For example, what's the difference between ISO_A3 and ISO_A3_EH? I can't find any info on google for "iso a3 eh". What is GEOUNIT? BRK_NAME? FCLASS? WOE?

It would be great to have at least a single sentence for each field here (or one for each group of related fields, e.g. ADM0_A3_US, ADM0_A3_IT, ADM0_A3_JP).
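
One way to see those groups of related fields is to bucket the property keys by prefix. This is only an exploratory sketch: the prefix list is taken from the keys visible in the Italy example above, the function name is hypothetical, and the grouping says nothing about what the fields mean.

```python
from collections import defaultdict

# Prefixes observed in the Italy example; the list is an assumption, not a spec.
GROUP_PREFIXES = ("ADM0_A3_", "FCLASS_", "NAME_", "ISO_", "MAPCOLOR")


def group_keys(properties):
    """Bucket property keys by the first matching prefix; others go under 'OTHER'."""
    groups = defaultdict(list)
    for key in properties:
        upper = key.upper()
        prefix = next((p for p in GROUP_PREFIXES if upper.startswith(p)), "OTHER")
        groups[prefix].append(key)
    return dict(groups)


# A small subset of the Italy properties.
sample = {
    "ADM0_A3": "ITA", "ADM0_A3_US": "ITA", "ADM0_A3_FR": "ITA",
    "ISO_A3": "ITA", "ISO_A3_EH": "ITA",
    "NAME_EN": "Italy", "NAME_ZH": "意大利",
    "FCLASS_ISO": "Admin-0 country", "POP_EST": 60297396,
}

for prefix, keys in group_keys(sample).items():
    print(prefix, keys)
```

Note that a plain prefix match is coarse: on the full record, `NAME_` would also pick up `NAME_LONG` and `NAME_SORT`, which are not translations.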

SpocWeb commented 1 year ago

I volunteer to add a metadata table (3 Columns: abbreviation, type and description) to the corresponding HTML Pages (like e.g. https://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-populated-places/) if @nvkelso accepts such Pull Requests.

nvkelso commented 1 year ago

Yes, PRs accepted.

brycied00d commented 1 year ago

For example, what's the difference between ISO_A3 and ISO_A3_EH?

I too would love to know what the "_EH" suffix is used for, and thus the difference between iso_a2 and iso_a2_eh fields. (Why is France's iso_a2=-99 while its iso_a2_eh is FR?)

nvkelso commented 1 year ago

Per Urban Dictionary:

Eh – Meaning confused - or used at the end of a sentence to show it is a question (Canadian English) or "approximately right" (California English)

Natural Earth and ISO have slightly different understandings of the world... and some NE admin 0 "levels" (e.g. country versus map unit) match ISO's entities exactly, while others match only approximately.
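
In code, one way to cope with this is to treat `-99` as "no value" and fall back from the strict field to its `_EH` counterpart. The sketch below assumes that reading of the two fields; the helper name is hypothetical, and the France values are the ones quoted in the question above.

```python
# Values treated as "missing"; -99 appears both as a string and as a number in the thread.
MISSING = {"-99", -99, None, ""}


def iso_a2(properties):
    """Return ISO_A2 if set, else ISO_A2_EH, else None (keys matched case-insensitively)."""
    props = {key.upper(): value for key, value in properties.items()}
    for field in ("ISO_A2", "ISO_A2_EH"):
        value = props.get(field)
        if value not in MISSING:
            return value
    return None


print(iso_a2({"iso_a2": "-99", "iso_a2_eh": "FR"}))
print(iso_a2({"ISO_A2": "IT", "ISO_A2_EH": "IT"}))
```

Whether the approximate code is acceptable depends on the use case, so callers that need a strict ISO match should read `ISO_A2` directly instead.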

philipshirk commented 7 months ago

I volunteer to add a metadata table (3 Columns: abbreviation, type and description) to the corresponding HTML Pages (like e.g. https://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-populated-places/) if @nvkelso accepts such Pull Requests.

Any update on this? Is metadata available, yet?

SpocWeb commented 7 months ago

@philipshirk sorry, no time and too little knowledge yet. I am only an amateur.

applegould commented 7 months ago

I have been looking for explanations of the attribute table column headers for a while now and haven't been able to find anything. It's very frustrating to not have documentation for naming conventions to make the best choice on analyzing information and displaying results. Is there a reason the READMEs take us back to the download page? If there is a working document for Natural Earth Metadata, where is it?

kraktus commented 6 months ago

There has been a start in https://github.com/nvkelso/natural-earth-vector/pull/861.

DigitalNaut commented 6 months ago

I have a few questions:

This is a bit of a struggle 😅 Many thanks.

Edit: I think I got part of my answer for the third question by looking at the map units data tables. A sovereign country has tentative control over other independent countries, like China -> Taiwan. I'm still not sure what Sovereignty means on its own for Cuba & Kazakhstan, though.