Closed mledoze closed 1 year ago
It might be useful to provide the country name in the native language of the country itself (e.g. {"name": "Germany", "name_native": "Deutschland"}).
...
The CLDR database of the unicode project contains Country-To-Language data, including the percent of speakers: http://www.unicode.org/cldr/charts/latest/supplemental/territory_language_information.html
It might be useful to provide the country name in the native language of the country itself
The native name of Germany is already in 'alt-spellings'. I recognize that the name 'alt-spellings' isn't good since it contains alternative spellings and the native name of the country. So there are two solutions here:
Initially, I created this dataset with a country selector in mind [1] but it would make more sense to be able to get the native names separately. So I would choose the second option.
But the second option raises the question of how to write the native name of the country. German uses Latin characters, so it's easy to recognize that it's Germany, but what about Armenia, for example, which is written Հայաստան in Armenian [2]? For some people it might be difficult to know that it's Armenia.
What do you think?
I know that alternative spellings and native names are missing for many countries, I'm currently working on adding them. Also, I'll add the native/official language(s) of each country.
[1] https://github.com/JamieAppleseed/selectToAutocomplete [2] http://en.wikipedia.org/wiki/Armenia
Not everyone speaks English, so some people might be confused while selecting their locale. It would be useful to be able to see the English and native versions of the country name side by side in the selector.
I would recommend providing both versions for different individual use cases.
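One way to serve both use cases is to build the selector label from both names. This is only a minimal sketch, assuming each entry carries the `name` and `name_native` fields proposed at the top of this thread; `name_native` is not an existing field of the dataset, and `selectorLabel` is a hypothetical helper.

```javascript
// Hypothetical entries using the field names proposed in this thread.
const sampleCountries = [
  { name: "Germany", name_native: "Deutschland" },
  { name: "Armenia", name_native: "Հայաստան" },
  { name: "Ireland" } // no native name recorded in this sample
];

// Show the English name and, when a different native name is known,
// append it in parentheses so both audiences can recognize the entry.
function selectorLabel(country) {
  if (country.name_native && country.name_native !== country.name) {
    return country.name + " (" + country.name_native + ")";
  }
  return country.name;
}

const labels = sampleCountries.map(selectorLabel);
console.log(labels.join("\n"));
```

Entries without a native name simply fall back to the English name, so the selector keeps working while the data is still being filled in.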
Right, that's a valid point for non-English speakers.
If you want, feel free to start working on adding the native names as I'll be off for a few days.
I think it would be great to have a way to make Countries Hierarchical and have meta data describing whether they are countries or sovereign states.
For the UK currently it says "alt-spellings":"GB,Great Britain,England,UK,Wales,Scotland,Northern Ireland".
The full name of the UK is "The United Kingdom of Great Britain and Northern Ireland". It is not a country, it is a sovereign state.
Great Britain also isn't a country, it's an island.
There are three countries in Great Britain: England, Scotland and Wales.
So the types I think needed are: Country, State, Sovereign State and potentially Nation and Union as well.
Then it would be good to have a way to specify that England is within the UK and if you also have unions that it is within the EU.
Another nice feature would be to list what land borders a country has. So you could specify that England borders Scotland and Wales for example.
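The hierarchy and border ideas above could be modelled roughly as follows. This is a sketch under assumptions, not a proposal for the actual schema: the keys (`england`, `uk`, `eu`) and the `type`, `parent`, and `borders` fields are hypothetical names, and the relationships mirror the example given in this comment.

```javascript
// Hypothetical entities; "parent" points at the containing entity.
const entities = {
  eu: { name: "European Union", type: "union", parent: null },
  uk: { name: "United Kingdom", type: "sovereign state", parent: "eu" },
  england: { name: "England", type: "country", parent: "uk", borders: ["scotland", "wales"] },
  scotland: { name: "Scotland", type: "country", parent: "uk", borders: ["england"] },
  wales: { name: "Wales", type: "country", parent: "uk", borders: ["england"] }
};

// Walk up the "parent" links and collect every containing entity.
function ancestors(id) {
  const chain = [];
  let current = entities[id];
  while (current && current.parent) {
    chain.push(current.parent);
    current = entities[current.parent];
  }
  return chain;
}

// True when "a" lists "b" among its land borders.
function sharesBorder(a, b) {
  const borders = (entities[a] && entities[a].borders) || [];
  return borders.indexOf(b) !== -1;
}

console.log(ancestors("england"));
console.log(sharesBorder("england", "wales"));
```

A single `parent` link keeps the sketch short; an entity that belongs to several unions at once would need a list instead.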
From https://github.com/ProGNOMmers
It would be wonderful if it would be possible to retrieve regions, provinces and cities.
Something like:
// Regions of country
// /rest/alpha2/it/regions ->
{ regions: [ "Abruzzi e Molise",
"Basilicata",
"Calabria",
"Campania",
"Emilia-Romagna",
"Friuli-Venezia Giulia",
"Lazio",
"Liguria",
"Lombardia",
"Marche",
"Piemonte",
"Puglia",
"Sardegna",
"Sicilia",
"Toscana",
"Trentino-Alto Adige",
"Umbria",
"Valle d'Aosta",
"Veneto" ] }
// Provinces of region
// /rest/alpha2/it/regions/Veneto/provinces ->
{ provinces: [ "Verona", "Venezia", ... ] }
// Cities of province
// /rest/alpha2/it/regions/Veneto/provinces/Venezia/cities ->
{ cities: [ { name: "Venezia", zip_codes: [ "30121", ... , "30176" ] },
{ name: "Chioggia", zip_codes: [ "30015" ] },
{ name: "San Donà di Piave", zip_codes: [ "30027" ] },
... ] }
// Cities of country by name
// /rest/alpha2/it/regions/Veneto/provinces/Venezia/cities ->
{ cities: [ { name: "Venezia", zip_codes: [ "30121", ... , "30176" ] },
{ name: "Chioggia", zip_codes: [ "30015" ] },
{ name: "San Donà di Piave", zip_codes: [ "30027" ] },
... ] }
Cities could have metadata such as zip codes, which are very useful.
It is a huge amount of work, because recording and maintaining the whole list of regions, provinces and cities for every country in the world is hard, but it is a good goal for an open source project.
@stephenpaulger
I think it would be great to have a way to make Countries Hierarchical and have meta data describing whether they are countries or sovereign states.
I agree, I'll add this to the to-do list. I know that many entries in the dataset are not actual countries. I wanted to provide simple and factual data about world countries, but I understand that more accuracy is needed.
@fayder
It would be wonderful if it would be possible to retrieve regions, provinces and cities.
Yes, it is a huge amount of work. First I want to continue adding more data at the country level (native and official names, official language, etc.) and add the master file as soon as possible (#12) to make contributions easier.
Thank you for your help/feedback, I appreciate it!
For the UK currently it says "alt-spellings":"GB,Great Britain,England,UK,Wales,Scotland,Northern Ireland".
@stephenpaulger in bd22b4a97f30ead3ae55f68d2c3e9b86ba784ba7 I have removed most of the names in altSpellings, now it's just GB,UK,Great Britain.
We can also add time zone data from http://timezonedb.com/download.
It would be really nice if there were also a list of states per country, such as the states of the United States. http://en.wikipedia.org/wiki/List_of_states_and_territories_of_the_United_States
@shanti2530 yes, this has been suggested https://github.com/mledoze/countries/issues/6#issuecomment-27620009 but it has not been done yet because the work is pretty huge. Do you know a source where we can find the states for every country?
@mledoze don't know if this is what you were looking for http://vikku.info/programming/geodata/geonames-get-country-state-city-hierarchy.htm
@shanti2530 this seems very good, thank you. I'll create an issue for this. Would you like to work on this?
GeoJSON outlines of the countries: https://github.com/datasets/geo-boundaries-world-110m
@gerbenjacobs yes good idea, I'll add this to the to-do
I agree with @gerbenjacobs's idea of GeoJSON outlines of the countries.
@mledoze don't know if it's in the scope of this project, but I would love to see financial information like GDP, GDP per capita, GNI etc. - problem with this is of course that these numbers would change every year.
@matiassingers no it's not really in the scope of this project. I prefer to stick with static data that do not change. The dataset currently contains population data which are not in the scope and I would like to remove it in the near future.
Although it does not currently contain GDP data, you should check this project https://github.com/tinata/tinatapi which contains other financial data.
@dalu postal prefixes are a good idea!
@dalu you are saying that postal services want the native country name instead of the country postal prefix?
I would like to inform you that I am about to remove population data because they require frequent updates to stay relevant.
I recently added CONTRIBUTING, which explains the contribution rules of this project. Population data do not follow these rules.
acknowledged
How about the address format, from the page mentioned above: http://en.wikipedia.org/wiki/Address_(geography)
This may be fairly difficult to do as it requires some pseudo templating language, so say for US: "addressFormat": "{{name}}\n{{houseNumber}} {{street}}\n{{locality}}\n{{city}}\n{{postalCode}}"
And would need agreement on the labels used...
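To make the templating idea concrete, here is a minimal sketch that renders the `addressFormat` string suggested above. The `renderAddress` helper and the sample address are hypothetical, and the labels are simply the ones from the example, not an agreed vocabulary.

```javascript
// The format string from the suggestion above.
const usFormat =
  "{{name}}\n{{houseNumber}} {{street}}\n{{locality}}\n{{city}}\n{{postalCode}}";

// Replace each {{label}} with the matching field, then drop lines that
// end up empty because a field was not supplied.
function renderAddress(format, fields) {
  return format
    .replace(/\{\{(\w+)\}\}/g, function (match, label) {
      const known = Object.prototype.hasOwnProperty.call(fields, label);
      return known ? String(fields[label]) : "";
    })
    .split("\n")
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line.length > 0; })
    .join("\n");
}

// Made-up sample data; "locality" is deliberately left out.
const rendered = renderAddress(usFormat, {
  name: "Jane Doe",
  houseNumber: "12",
  street: "Example Street",
  city: "Springfield",
  postalCode: "00000"
});
console.log(rendered);
```

Dropping empty lines is one possible policy for optional fields; a real format would also need a rule for fields that share a line.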
Hi, nice project! Thanks.
Something that would be useful to me is to know if a country is in the European Union. (https://en.wikipedia.org/wiki/Member_state_of_the_European_Union)
This information is needed when you are a company in the EU dealing with international customers. If you charge VAT or not depends on whether your customer is in the EU or not.
If you are interested in including this information I could setup a pull-request
@tdegrunt yes it is indeed a difficult task, but @hexorx managed to do it in his countries repository: https://github.com/hexorx/countries/blob/master/lib/data/countries.yaml
@0x01 yes I'm interested in including this information. Could you please add it as extra data in the data folder using [cca3].json file names?
Thank you!
@mledoze: I can do that (in the data folder), but I think it makes more sense to put it into the main file. There is very little data added, basically a boolean indicating whether or not it's an EU member state (and I'll leave out the field if it's false).
@0x01 you are right that this represents little data, but it would be useful for only 10% of the countries (26 member states of the EU out of 251 "countries" in the dataset).
Moreover, the EU is categorized as a supranational union and there are many other unions in the world (see [1]). So as not to add many booleans to the main file, I prefer to add this data in separate files.
[1] http://en.wikipedia.org/wiki/Political_union#Supranational_and_continental_unions
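As an illustration of keeping union membership out of the main file, the sketch below uses an in-memory object as a stand-in for per-country extra files named after the cca3 code. The `unions` key and the `isMemberOf` helper are hypothetical, and only a tiny sample of countries is shown.

```javascript
// Stand-in for the contents of per-country extra files (one per cca3 code).
// The "unions" field name is an assumption, not part of the dataset.
const extraByCca3 = {
  AUT: { unions: ["EU"] },
  BEL: { unions: ["EU"] },
  CHE: { unions: [] }
};

// A list of unions per country avoids one boolean per union in the main file.
function isMemberOf(cca3, union) {
  const extra = extraByCca3[cca3];
  return Boolean(extra && extra.unions && extra.unions.indexOf(union) !== -1);
}

console.log(isMemberOf("AUT", "EU"));
console.log(isMemberOf("CHE", "EU"));
```

A VAT check like the one described earlier in the thread would then be a single `isMemberOf(code, "EU")` call, and countries without an extra file are treated as non-members.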
Hm, true. I withdraw my offer for a pull request, as EU membership is indeed much more subtle. Should I ever need to sort this out properly I'll come back with a pull request, but for now a simple list of country names is enough to get a rough indication, which is good enough for my purposes. For example, this code in Node does the trick:
var _ = require('underscore'); // provides _(list).contains()
var rawCountries = require('./countries.json'); // adjust the path to your copy of the dataset
// Country names treated as EU members, as listed in this comment
var EU = [
  "Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus",
  "Czech Republic", "Denmark", "Estonia", "Finland", "France", "Germany",
  "Greece", "Hungary", "Ireland", "Italy", "Latvia", "Lithuania",
  "Luxembourg", "Malta", "Netherlands", "Poland", "Portugal",
  "Romania", "Slovakia", "Slovenia", "Spain", "Sweden", "United Kingdom"
];
var countries = rawCountries.map(function (country) {
  return { // for example
    code: country.cca3,
    name: country.name,
    // true when the country's name appears in the EU list above
    eu_member: _(EU).contains(country.name)
  };
});
I'd add coastline length (CIA factbook field 2060).
How about population??
@herrniemand population data was already added to this dataset (https://github.com/mledoze/countries/commit/81fa9f68215d92fba2a850c272d019f539cf30ad) but later removed (see https://github.com/mledoze/countries/issues/6#issuecomment-42322804).
I've seen you were discussing official names, and I got an idea:
{
"name": {
"common": "Afghanistan",
"native": "\u0627\u0641\u063a\u0627\u0646\u0633\u062a\u0627\u0646",
"official": "Islamic Republic of Afghanistan"
}
}
Otherwise there are too many "name%insert type here%" fields.
@herrniemand this is a very good idea. I also want to add the official name in its native language, so we could have something like this:
{
"name": {
"common": "Afghanistan",
"official": "Islamic Republic of Afghanistan",
"native": {
"common" : "\u0627\u0641\u063a\u0627\u0646\u0633\u062a\u0627\u0646",
"official": "\u062f \u0627\u0641\u063a\u0627\u0646\u0633\u062a\u0627\u0646 \u0627\u0633\u0644\u0627\u0645\u064a \u062c\u0645\u0647\u0648\u0631\u06cc\u062a"
}
}
}
What do you think?
@mledoze Yea. Awesome.
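For readers consuming the nested `name` object proposed above, a lookup with fallbacks might look like the following sketch. The `getName` helper and its parameters are hypothetical; the sample object is the Afghanistan example from this thread.

```javascript
// The Afghanistan example from the proposal above.
const afghanistan = {
  name: {
    common: "Afghanistan",
    official: "Islamic Republic of Afghanistan",
    native: {
      common: "\u0627\u0641\u063a\u0627\u0646\u0633\u062a\u0627\u0646",
      official: "\u062f \u0627\u0641\u063a\u0627\u0646\u0633\u062a\u0627\u0646 \u0627\u0633\u0644\u0627\u0645\u064a \u062c\u0645\u0647\u0648\u0631\u06cc\u062a"
    }
  }
};

// kind is "common" or "official". The native form is preferred on request,
// with a fallback to the English form and finally to the common name.
function getName(country, kind, preferNative) {
  const name = country.name;
  if (preferNative && name.native && name.native[kind]) {
    return name.native[kind];
  }
  return name[kind] || name.common;
}

console.log(getName(afghanistan, "official", false));
console.log(getName(afghanistan, "common", true));
```

The fallbacks matter while native and official names are still missing for many countries, since callers always get at least the common name.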
@mledoze how about translations? I was thinking something like:
{
"name": {
"common": "Afghanistan",
"official": "Islamic Republic of Afghanistan",
"native": {
"common" : "\u0627\u0641\u063a\u0627\u0646\u0633\u062a\u0627\u0646",
"official": "\u062f \u0627\u0641\u063a\u0627\u0646\u0633\u062a\u0627\u0646 \u0627\u0633\u0644\u0627\u0645\u064a \u062c\u0645\u0647\u0648\u0631\u06cc\u062a"
},
"translations":{
"ru":...,
"de":...
}
}
}
or should we keep as it is?
@herrniemand I prefer to keep the translations as it is for now.
@mledoze ok.
@mledoze I've just got stuck on a problem with native names. Some countries like Afghanistan and Åland Islands have more than one official language, so what do we define as native?
@herrniemand for countries with more than one official language, you should use the language that is listed first in the language property. So for Afghanistan and Åland Islands, it is Pashto and Swedish respectively.
@mledoze ok. Thanks.
@mledoze Question about translations. Which languages do we want the country names to be translated into? I suggest the UN (7) official languages plus the first 15 of the most spoken: https://en.wikipedia.org/wiki/List_of_languages_by_number_of_native_speakers
@mledoze Another thing. Since we have changed the structure of the language block, wouldn't it be right to change translations to the same style:
"translations":{
"de": {
"common":"Russland",
"official":"Russische Föderation"
}...
}
?
Not exactly contributing to this issue, the feature list or the roadmap, but I simply wanted to express my deepest thanks for putting this awesome list up on the web. I've been searching for it for over 2 years, and a simple though epicly effective Google search turned this page up.
Thank you for your amazing work and this amazing list of countries!
@ReSpawN you're very welcome, thank you for your comment, I really appreciate it! This work would not be as it is now without the help of all the contributors.
If you do something with this dataset, don't hesitate to add it to the showcase list in the readme.
A variable saying if the country is a landlocked country or not http://en.wikipedia.org/wiki/Landlocked_country
+1 for @romsson's idea.
+1 @romsson's idea *)
I would like to discuss here the data that should be added to this repository.
A similar project, OxJS [1], contains a lot more data, such as the land area or the latitude/longitude coordinates of each country.
Is it interesting/useful to have this kind of data too?
Data that can be added:
What would you like to be added?
Please let me know in the comments.
[1] http://oxjs.org/#doc/Ox.COUNTRIES [2] source: http://opengeocode.org/ [3] source: https://oxjs.org/#doc/Ox.COUNTRIES
From the comments