InternetHealthReport / internet-yellow-pages

A knowledge graph for the Internet
https://iyp.iijlab.net
GNU General Public License v3.0

Add country population from World Bank #129

Closed JustinLoye closed 8 months ago

JustinLoye commented 8 months ago

Explain the dataset you want to add and how it would contribute to the Internet Yellow Pages.

It could be useful to have each country's population. For example, the ihr.country_dependency ranking gives the relative population coverage of eyeball networks per country, but does not indicate the actual population value.

World Bank seems to be the authoritative data source.
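To illustrate why the absolute value helps, here is a minimal sketch of turning a relative coverage into a head count. It assumes the coverage is expressed as a percentage; the function name and the numbers are made up for illustration and are not taken from IYP or the World Bank.

```python
def covered_population(total_population: int, coverage_percent: float) -> int:
    """Estimate how many people a relative coverage value corresponds to.

    Assumption: coverage_percent is a percentage in [0, 100].
    """
    if total_population < 0:
        raise ValueError("total_population must be non-negative")
    if not 0 <= coverage_percent <= 100:
        raise ValueError("coverage_percent must be between 0 and 100")
    return round(total_population * coverage_percent / 100)


# Made-up numbers: a country of 1,000,000 people and a network covering 25%.
example = covered_population(1_000_000, 25.0)
print(example)  # 250000
```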

Provide the name of the organization providing the data and the url to the dataset

If possible describe how you would like to model the dataset in the Yellow Pages

Since there is only one value from one dataset per country, I suggest adding it as a node property.

m-appel commented 8 months ago

Although it is intuitive to want to add it as a node property, I would propose we add it as a POPULATION relationship to some node instead.

The main reason is that, with a node property, we could not track the source of the data like we normally do (the reference_* properties only exist for relationships). This would be fine if the population were a de facto value (like the country name we add in country_information), but it is not.

Another reason would be that if we ever wanted to add a second source for population estimates (I don't think we will, but oh well...), we could not do so.

The only question is how we label the node we connect the relationship with. I'd propose to be generic and just call it Estimate, with a name property. We could be more specific, like PopulationEstimate, but I don't see a reason for this, since we have the POPULATION already in the relationship. And we could reuse Estimate for other things in the future. Opinions @romain-fontugne?

(:Country)-[:POPULATION {value: 123}]->(:Estimate {name: 'World Bank Population Estimate'})
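As a rough sketch of how the pattern above could be written to the graph, the snippet below builds a parameterized Cypher statement together with its parameters. The names country_code, reference_org and reference_url, and the helper build_population_params, are assumptions made for illustration; the actual IYP crawler code and the exact reference_* property names may differ.

```python
# Cypher sketch of the proposed model. Only Country, POPULATION, Estimate,
# value and name come from the discussion; country_code, reference_org and
# reference_url are assumed property names.
QUERY = """
MERGE (c:Country {country_code: $country_code})
MERGE (e:Estimate {name: $estimate_name})
MERGE (c)-[r:POPULATION {reference_org: $reference_org}]->(e)
SET r.value = $value,
    r.reference_url = $reference_url
"""


def build_population_params(country_code: str, population: int,
                            reference_org: str, reference_url: str) -> dict:
    """Build the parameters for QUERY (hypothetical helper)."""
    if len(country_code) != 2 or not country_code.isalpha():
        raise ValueError("expected a two-letter country code")
    if population < 0:
        raise ValueError("population must be non-negative")
    return {
        "country_code": country_code.upper(),
        # Node name derived from the source, as in the proposed pattern.
        "estimate_name": f"{reference_org} Population Estimate",
        "reference_org": reference_org,
        "reference_url": reference_url,
        "value": population,
    }


# Placeholder values, not real data.
params = build_population_params("xx", 123, "World Bank", "https://example.org/data")
print(params["estimate_name"])  # World Bank Population Estimate
```

Because the relationship is merged on reference_org, a second source would produce its own POPULATION relationship and Estimate node instead of overwriting the first, which is the flexibility a node property would not give. With the official neo4j Python driver, the statement could then be run as `session.run(QUERY, params)`.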