datasets / awesome-data

Curated list of quality open datasets
https://datahub.io/collections

[super] Population Datasets #167

Open rufuspollock opened 8 years ago

rufuspollock commented 8 years ago

This is a "super" issue for population datasets.

zelima commented 8 years ago

@rgrp

zelima commented 8 years ago

@gsilvapt Could you help with this one?

gsilvapt commented 8 years ago

@zelima

Yes, I can. The idea is to divide the data sets into 1) a male/female set and 2) sets for the different age bands you have there, correct?

Notice: I will be out of town until Saturday, so this might take some extra time.

zelima commented 8 years ago

@gsilvapt that's great! There's no need to rush, you can start packaging when you have time for it. :)

gsilvapt commented 8 years ago

@rgrp and @zelima

If we are following the structure proposed above in this comment, wouldn't it make sense to merge all data sets into a single set called "Population"? We can explain the division in the README file.

zelima commented 8 years ago

@gsilvapt That could be an option. What are your thoughts about this @rgrp?

As a starting point, we could package them as several small ones, and if there's a need we can merge them afterwards.

rufuspollock commented 8 years ago

@zelima @gsilvapt this is a "super" issue so generally we would not generate a data package to address this issue but would create data packages for all the subissues it references.

In that spirit, @zelima you should create a separate issue for the "population broken down" (as per info in your comment above). And then link that from the main description for this issue.

In general, I prefer the "small is beautiful" approach to creating data packages: several smaller ones rather than one massive data package. That said, there are times when data should clearly go in one whole data package, so there is an aspect of judgment.

To make any decision here we'd need to clearly lay out what we wanted to merge together.

zelima commented 8 years ago

@rgrp I think we can come up with two data packages here:

year,age,percentage
1960,0-15,30
1960,15-65,50
1960,65-above,20
1961,0-15,35
...
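
A minimal sketch, assuming Python and the column names from the sample above, of how this long layout (one row per year and age band) could be read and sanity-checked. The function names `shares_by_year` and `incomplete_years` are hypothetical and not part of any data package tooling, and only the three complete 1960 rows from the sample are used.

```python
import csv
import io
from collections import defaultdict

# The three complete 1960 rows from the sample; the truncated 1961 rows are left out.
SAMPLE = """year,age,percentage
1960,0-15,30
1960,15-65,50
1960,65-above,20
"""


def shares_by_year(text):
    """Return {year: {age_band: percentage}} from long-format CSV text."""
    shares = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        shares[int(row["year"])][row["age"]] = float(row["percentage"])
    return dict(shares)


def incomplete_years(shares, tolerance=0.01):
    """List the years whose age-band percentages do not sum to 100."""
    return [
        year
        for year, bands in sorted(shares.items())
        if abs(sum(bands.values()) - 100.0) > tolerance
    ]


shares = shares_by_year(SAMPLE)
print(shares[1960])
print(incomplete_years(shares))
```

A check like this could flag a year where an age band is missing before the file is packaged; whether percentages are expected to sum to exactly 100 depends on the source data, so the tolerance is only a guess.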

gsilvapt commented 8 years ago

@rgrp The items to merge would be the total population, country population (divided by male/female and age bands) and city population, as a single data set. However, your point about "super" issues makes sense, and it would be easy to leave things as they are right now and just add another set with the structure of @zelima's last comment.

rufuspollock commented 7 years ago

@zelima can we edit the main description to add the population-by-xxx items, and the sources they would come from?