jku-icg-classroom / va-project-2020-group-n-1

va-project-2020-group-n-1 created by GitHub Classroom

Which dataset should we choose? #1

Open hs222 opened 3 years ago

hs222 commented 3 years ago

I would suggest something from http://www.gapminder.org/data/ or https://ec.europa.eu/eurostat/data/database.

johannes-kroepfl-97 commented 3 years ago

I think we've overlooked the deadline :/ Nevertheless, I did some research on datasets and think we should choose several, since a single dataset would not be complex enough. The following combination sounds interesting. All of them can be downloaded as CSV files (easy to handle in pandas) and seem to be complete. Go to http://www.gapminder.org/data/ and, under "List of Indicators", navigate to:

1) Economy - Incomes and growth - GDP total, yearly growth
2) Education - Gender equality - Gender ratio of mean years in school
3) Environment - Emission - CO2 emissions per person
4) Babies per woman (total fertility)

I think this combination would allow for some interesting observations, and the datasets are all sorted by country and year. We could also drop one of those datasets. Please have a look at them and give me some feedback. Regards, Johannes
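
For anyone who wants to try combining the indicators, here is a minimal pandas sketch. It assumes each downloaded CSV is in wide form, with a `country` column followed by one column per year; that layout is an assumption and should be checked against the actual files. The helper name `to_long`, the value column names, and the tiny in-memory samples (placeholder numbers, not real Gapminder values) are made up for illustration.

```python
import io

import pandas as pd


def to_long(csv_source, value_name):
    """Reshape a wide table (one row per country, one column per year) into long form.

    Assumes a `country` column plus year-named columns (hypothetical layout).
    """
    wide = pd.read_csv(csv_source)
    long_df = wide.melt(id_vars="country", var_name="year", value_name=value_name)
    long_df["year"] = long_df["year"].astype(int)
    return long_df


# Placeholder samples standing in for two downloaded files (values are invented).
co2_csv = io.StringIO("country,2000,2001\nAustria,1.0,2.0\nBulgaria,3.0,4.0\n")
fertility_csv = io.StringIO("country,2000,2001\nAustria,10.0,20.0\nBulgaria,30.0,40.0\n")

co2 = to_long(co2_csv, "co2_per_person")
fertility = to_long(fertility_csv, "babies_per_woman")

# Join on the shared keys so each row is one (country, year) observation.
merged = co2.merge(fertility, on=["country", "year"], how="inner")
merged = merged.sort_values(["country", "year"]).reset_index(drop=True)
print(merged)
```

With real files, the `io.StringIO` objects would be replaced by file paths. An inner join keeps only the country/year pairs present in every indicator; `how="outer"` would keep the gaps instead, which may be preferable if coverage differs between datasets.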

vparonov commented 3 years ago

Your proposal makes sense, so you have my vote for it. I would suggest creating a chat in WhatsApp or one of the other popular chat apps and continuing the discussion there.

vparonov commented 3 years ago

I downloaded the datasets and put them into the data folder.

hs222 commented 3 years ago

Good choice. Thx for the download. I also think WhatsApp is probably better for further discussion.

johannes-kroepfl-97 commented 3 years ago

Good idea! Here is the invitation link to the group: https://chat.whatsapp.com/JttVjD3dCbPGM71c5ioIuY

Agent-Ace commented 3 years ago

Hi, sorry for joining late (didn't see that new issue) and thanks for proposing/choosing the datasets.