mledoze / countries

World countries in JSON, CSV, XML and Yaml. Any help is welcome!
https://mledoze.github.io/countries/
Open Data Commons Open Database License v1.0
5.97k stars 1.27k forks source link

Add the country states/regions #40

Closed mledoze closed 7 years ago

mledoze commented 10 years ago

The idea is to add the country states or regions (depending on the country).

A format was proposed here: https://github.com/mledoze/countries/issues/6#issuecomment-27620009

@shanti2530 suggests importing the data from: http://vikku.info/programming/geodata/geonames-get-country-state-city-hierarchy.htm

I propose not to add cities because it would make the dataset too big in my opinion. I prefer to focus on top-level data.

mledoze commented 10 years ago

I've been thinking about this and I have mixed feelings because it will take us away from the goal of this project, which is to provide top-level data about world countries.

What do you think? Is there really a need for this?

shanti2530 commented 10 years ago

My biggest concern with the addition of the states is that the file will become very large.

What about having separate files for each country with additional details such as its states?

geraldb commented 10 years ago

FYI: You can get states (regions) for countries at the openmundi/world.db site. All data is public domain use as you please. Cheers.

mledoze commented 10 years ago

> My biggest concern with the addition of the states is that the file will become very large.

That is my concern too. I agree to use separate files.

@geraldb thank you for the link.

wiredmax commented 10 years ago

I would like to contribute to this. If we separate the files, what structure are we looking for, and shall we use cca2 or cca3 for naming the files?

What about the folder structure? Here are a few ideas:

I have a preference for /data/[cc].regions.json, what do you think?

We also need to be careful about how we name this; it's a bit tricky. The definitions and the hierarchy change from one place to another (states / provinces / counties / municipalities / parishes / districts / regions).

Canada: Country -> Province or Territory -> Region

Italy: Country -> Region -> Province

New Zealand: Country -> Region

United States: Country -> State -> County
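
To make the naming problem concrete, here is a minimal Python sketch of the four hierarchies listed above. The table name `SUBDIVISION_LEVELS`, the helper `level_name`, and the use of cca3 keys are assumptions made for illustration only; they are not part of the dataset.

```python
from typing import Dict, List, Optional

# Hypothetical lookup table: labels of the administrative levels below
# "country", copied from the four examples in this comment.
SUBDIVISION_LEVELS: Dict[str, List[str]] = {
    "CAN": ["province or territory", "region"],
    "ITA": ["region", "province"],
    "NZL": ["region"],
    "USA": ["state", "county"],
}


def level_name(cca3: str, depth: int) -> Optional[str]:
    """Return the label at the given depth (0 = first level below the country).

    Returns None when the country is unknown or has no level at that depth.
    """
    levels = SUBDIVISION_LEVELS.get(cca3.upper(), [])
    if 0 <= depth < len(levels):
        return levels[depth]
    return None
```

Note that "region" is a first-level label in Italy and New Zealand but a second-level label in Canada, which is why a single fixed field name per level would be misleading.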

mledoze commented 10 years ago

I propose to use cca3 to name these files. Right now the files in data use the alpha-2 country code but I'm going to change that today to be coherent with the current data in countries.json (like the list of land borders).

Regarding the name pattern, I prefer to use /data/[cc].json to limit the number of files in the data folder. These files would contain specific data for each country like the list of regions, for example.
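
As a rough sketch of this naming pattern, the Python helper below builds the per-country file path from a cca3 code. The function name `country_file`, the default `data` directory argument, and the choice of lowercase file names are assumptions for illustration; the thread only fixes the `/data/[cc].json` pattern.

```python
from pathlib import PurePosixPath


def country_file(cca3: str, data_dir: str = "data") -> PurePosixPath:
    """Build the path of the per-country file from an alpha-3 country code.

    The code is lowercased (an assumption) and must be three ASCII letters.
    """
    code = cca3.strip().lower()
    if len(code) != 3 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"not an alpha-3 country code: {cca3!r}")
    return PurePosixPath(data_dir) / f"{code}.json"
```

With one file per country, the number of files in the data folder stays bounded by the number of countries, whatever extra data is added later.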

Thank you for your initiative, your contributions are very welcome!

wiredmax commented 10 years ago

What do you think about the format in the following examples (the Île-de-France region for France and Québec for Canada)? This schema would allow two levels of regional information and could very easily be expanded to provide population data, languages, and more for each level of regions.

Very glad to contribute :)

/data/can.json

{
  name: "Quebec",
  nativeName: "Québec",
  type: "Province",
  divisions: [
      { type: "region", name: 'Bas-Saint-Laurent' },
      { type: "region", name: 'Saguenay–Lac-Saint-Jean' },
      { type: "region", name: 'Capitale-Nationale' },
      { type: "region", name: 'Mauricie' },
      { type: "region", name: 'Estrie' },
      { type: "region", name: 'Montréal' },
      { type: "region", name: 'Outaouais' },
      { type: "region", name: 'Abitibi-Témiscamingue' },
      { type: "region", name: 'Côte-Nord' },
      { type: "region", name: 'Nord-du-Québec' },
      { type: "region", name: 'Gaspésie–Îles-de-la-Madeleine' },
      { type: "region", name: 'Chaudière-Appalaches' },
      { type: "region", name: 'Laval' },
      { type: "region", name: 'Lanaudière' },
      { type: "region", name: 'Laurentides' },
      { type: "region", name: 'Montérégie' },
      { type: "region", name: 'Centre-du-Québec' }
  ]
},
{
  name: "Ontario",
...

/data/fra.json

{
  name: "Ile-de-France",
  nativeName:  "Île-de-France",
  type: "Region",
  divisions: [
      { type: "Department", name: 'Paris' },
      { type: "Department", name: 'Seine-et-Marne' },
      { type: "Department", name: 'Essonne' },
      { type: "Department", name: 'Hauts-de-Seine' },
      { type: "Department", name: 'Seine-Saint-Denis' },
      { type: "Department", name: 'Val-de-Marne' },
      { type: "Department", name: 'Val-d\'Oise' }
  ]
},
{
  name: "Centre",
...
mledoze commented 10 years ago

The example is good! Can you please use lowercase strings for type?

mledoze commented 9 years ago

@wiredmax Hi, have you had any time to continue on this? Is there anything I can do to help?

wiredmax commented 9 years ago

I'm quite busy with a lot of stuff at work, but it's still a need we have in many projects. @msabeh works with me and she should be available to expand this further in early January.

She has already done a great job recently with: https://github.com/wiredmax/world-currencies

shehi commented 9 years ago

So how can we wrap up this proposal and get to work? :)

mledoze commented 9 years ago

@wiredmax any updates on this one? If you don't have time to finish this, it's not a problem, someone can take over.

wiredmax commented 9 years ago

@msabeh and I will start working on this. Starting today, we will fork it and structure the data this way if you still agree:

The only change from what was previously discussed in this issue is the lowercase type.

/data/can.json5

{
  name: "Quebec",
  nativeName: "Québec",
  type: "province",
  divisions: [
      { type: "region", name: 'Bas-Saint-Laurent' },
      { type: "region", name: 'Saguenay–Lac-Saint-Jean' },
      { type: "region", name: 'Capitale-Nationale' },
      { type: "region", name: 'Mauricie' },
      { type: "region", name: 'Estrie' },
      { type: "region", name: 'Montréal' },
      { type: "region", name: 'Outaouais' },
      { type: "region", name: 'Abitibi-Témiscamingue' },
      { type: "region", name: 'Côte-Nord' },
      { type: "region", name: 'Nord-du-Québec' },
      { type: "region", name: 'Gaspésie–Îles-de-la-Madeleine' },
      { type: "region", name: 'Chaudière-Appalaches' },
      { type: "region", name: 'Laval' },
      { type: "region", name: 'Lanaudière' },
      { type: "region", name: 'Laurentides' },
      { type: "region", name: 'Montérégie' },
      { type: "region", name: 'Centre-du-Québec' }
  ]
},
{
  name: "Ontario",
...

We'll start with the easy ones (from our perspective), North America and Europe, then South and Central America, then Africa (starting with the Maghreb and Northern Africa), then move on to the Middle East and Asia, and finish with all the special cases.

mledoze commented 9 years ago

@wiredmax I'm glad to hear that you are working on this. I still agree with the format. I think that we should use the same format for the name property. So in this case:

"name": {
    "common": "Quebec",
    "native": {
        "fra": {
            "common": "Québec"
        }
    }
}

It's a bit verbose, but it's coherent with the data in countries.json.
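
For illustration only, here is a small Python sketch of how the flat `name` / `nativeName` pair from the earlier examples could be converted into this nested form. The helper name `to_nested_name` and its parameters are hypothetical; the language key is assumed to be a three-letter code such as `fra`, as in the snippet above.

```python
from typing import Any, Dict


def to_nested_name(common: str, native: str, lang: str) -> Dict[str, Any]:
    """Wrap a flat name and native name into the nested "name" object.

    `lang` is the language code used as the key under "native"
    (assumed to be a three-letter code such as "fra").
    """
    return {
        "common": common,
        "native": {
            lang: {"common": native},
        },
    }


# Usage: the Quebec example from this comment.
quebec_name = to_nested_name("Quebec", "Québec", "fra")
```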

What do you think?

mathieumg commented 9 years ago

@wiredmax I imagine you meant to use double-quotes for the names?

@mledoze I agree, while it's more verbose it's more thorough, more flexible and (as you pointed out) consistent with the rest.

wiredmax commented 9 years ago

@mathieumg Yes, they are double-quoted since it's regular JSON and not JSON5.
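
A quick way to see why the quoting matters: a strict JSON parser rejects single-quoted strings, which JSON5 would accept. The sketch below uses Python's standard `json` module; the helper name `is_strict_json` and the two sample strings are made up for illustration.

```python
import json


def is_strict_json(text: str) -> bool:
    """Return True if `text` parses as strict JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Same entry, once with double quotes everywhere and once with a
# single-quoted value as in the earlier examples.
double_quoted = """{ "type": "region", "name": "Mauricie" }"""
single_quoted = """{ "type": "region", "name": 'Mauricie' }"""
```

Running a check like this over the data files would catch stray single quotes or unquoted keys before they are committed.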

wiredmax commented 9 years ago

@mledoze For the name, I agree too, but shall we do it only at the first level or at both? Doing it at both levels seems a bit heavy to me, since the native name is the one that is always used.

{
  "name": {
    "common": "Alberta",
    "native": {
      "eng": {
        "common": "Alberta"
      }
    }
  },
  "type": "province",
  "divisions": [
      { "type": "district", "name": "Acadia" },
      { "type": "district", "name": "Athabasca" },
...

vs

{
  "name": {
    "common": "Alberta",
    "native": {
      "eng": {
        "common": "Alberta"
      }
    }
  },
  "type": "province",
  "divisions": [
      {
        "type": "district",
        "name": {
          "common": "Acadia",
          "native": {
            "eng": "Acadia"
          }
        }
      },
      {
        "type": "district",
        "name": {
          "common": "Athabasca",
          "native": {
            "eng": "Athabasca"
          }
        }
      },
...
mledoze commented 9 years ago

@wiredmax yes, doing it on both levels is too heavy.

This solution

{
  "name": {
    "common": "Alberta",
    "native": {
      "eng": {
        "common": "Alberta"
      }
    }
  },
  "type": "province",
  "divisions": [
      { "type": "district", "name": "Acadia" },
      { "type": "district", "name": "Athabasca" },

is better :+1:
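
As a minimal sketch of what this agreed shape implies, the Python function below checks one first-level entry: a nested `name`, a lowercase `type`, and `divisions` whose names are plain strings. The function name `validate_subdivision` and the error messages are hypothetical, and the rules are only my reading of this thread, not an official schema.

```python
from typing import Any, Dict, List


def validate_subdivision(entry: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the entry fits the sketch."""
    problems: List[str] = []

    # First level: nested name object, as in countries.json.
    name = entry.get("name")
    if not isinstance(name, dict) or not isinstance(name.get("common"), str):
        problems.append("name.common must be a string")
    else:
        native = name.get("native", {})
        if not isinstance(native, dict):
            problems.append("name.native must be an object")
        else:
            for lang, value in native.items():
                if not isinstance(value, dict) or not isinstance(value.get("common"), str):
                    problems.append(f"name.native.{lang}.common must be a string")

    # The type was requested in lowercase earlier in the thread.
    kind = entry.get("type")
    if not isinstance(kind, str) or kind != kind.lower():
        problems.append("type must be a lowercase string")

    # Second level: plain string names only.
    divisions = entry.get("divisions", [])
    if not isinstance(divisions, list):
        problems.append("divisions must be a list")
    else:
        for i, division in enumerate(divisions):
            if not isinstance(division, dict):
                problems.append(f"divisions[{i}] must be an object")
                continue
            if not isinstance(division.get("type"), str):
                problems.append(f"divisions[{i}].type must be a string")
            if not isinstance(division.get("name"), str):
                problems.append(f"divisions[{i}].name must be a plain string")

    return problems


alberta = {
    "name": {"common": "Alberta", "native": {"eng": {"common": "Alberta"}}},
    "type": "province",
    "divisions": [
        {"type": "district", "name": "Acadia"},
        {"type": "district", "name": "Athabasca"},
    ],
}
```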

mathieumg commented 9 years ago

> Yes they are double quoted since it's regular JSON and not JSON5.

But you keep using single quotes in your examples. :wink:

wiredmax commented 9 years ago

All good then. @msabeh will start working on the fork, starting with North America. Once that is done, I'll submit a PR that we could merge into a regions branch on your project to see whether any adjustments are needed, and then continue with Europe. Once all the regions of the world are 98% done (the remaining 2% being the weird cases we will probably find), you could then merge into master.

@mathieumg I know, I didn't notice them. I've fixed it in my last comment, no worries. Anyway, @msabeh's Atom editor will warn her about single quotes.

msabeh commented 9 years ago

Yay! I'll be working on this this afternoon, I'll keep you guys posted.

justqyx commented 7 years ago

Hi all, thanks for the work you have all done. I'd like to ask whether there has been any considerable progress on this issue?

mledoze commented 7 years ago

@justqyx hello, unfortunately I think this PR is dead, would you like to work on it?

mledoze commented 7 years ago

I'll close this issue for the moment. If someone wants to work on it, feel free to open a new issue.

adrienne commented 5 years ago

Bookmarking this as a reminder to myself because I might be able to help. @adrienne

mledoze commented 5 years ago

@adrienne your help is welcome! :slightly_smiling_face: