jsoma / data-studio-projects


Syrian refugees around the world since civil war began #156

Open mattrehbein opened 6 years ago

mattrehbein commented 6 years ago

Pitch

Summary

I want to explore where Syrian refugees have gone, country by country, since Syria's civil war began in 2011. Because my data covers that whole period, I'd also like to see whether there's anything interesting about how the total number of Syrian refugees has changed from year to year. My data comes from UNHCR.

Details

Possible headline(s): Where Syrian Refugees Go

Data set(s): http://popstats.unhcr.org/en/persons_of_concern

Code repository: https://github.com/mattrehbein/data_studio/tree/master/code/02-project

Possible problems/fears/questions:

Work so far

I've found my data set on UNHCR's site, but I'm having a hard time getting pandas to accept the CSV I downloaded, so work thus far is pretty limited. Here's what I'd like to end up with, but focused on Syria specifically rather than on all refugees worldwide.

image image
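A minimal sketch of one way a stubborn CSV can be coaxed into pandas. It assumes, without having seen the actual download, that the file starts with a few free-text lines before the real header row and marks suppressed counts with `*`. The inline sample, its column names and its numbers are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical stand-in for the downloaded file: two free-text preamble
# lines, then the real header row, with "*" marking suppressed counts.
# The layout and the numbers are invented for illustration only.
raw_csv = """Extracted from a statistics database
Some descriptive line about the extract
Year,Country of asylum,Origin,Refugees
2016,Country A,Syrian Arab Rep.,1200
2016,Country B,Syrian Arab Rep.,*
2017,Country A,Syrian Arab Rep.,1500
"""

refugees = pd.read_csv(
    io.StringIO(raw_csv),  # replace with the path to the real CSV
    skiprows=2,            # skip the preamble so the header row becomes the column names
    na_values=["*"],       # read suppressed values as missing instead of as text
)

print(refugees.dtypes)
print(refugees)
```

If the real file has a different number of preamble lines, opening it in a text editor and counting the lines above the header row gives the value for `skiprows`.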

Checklist

mattrehbein commented 6 years ago

Update

To add another element (and data set) to the project, I got GDP data from the World Bank, in order to see how it relates to Syrian refugee hosting (how are the world's richest nations responding to one of the world's biggest and most publicized crises).

I plan to see how a scatter plot of refugee levels against GDP looks, but I don't have much to show graph-wise at the moment. I keep getting an "empty dataframe" error when trying to plot, and I haven't been able to fix it yet.
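One common cause of pandas complaining that there is no numeric data to plot is that the columns were read as text. Whether that is what happened here is an assumption; the sketch below only shows how to check and convert. The column names and values are placeholders.

```python
import pandas as pd

# Made-up values: the counts arrive as text because of "*" and thousands separators.
df = pd.DataFrame({
    "country": ["Country A", "Country B", "Country C"],
    "refugees": ["1,200", "*", "350"],
    "gdp": ["2.5e12", "4.0e11", "1.1e10"],
})
print(df.dtypes)  # refugees and gdp show up as object (text), so nothing is numeric

for col in ["refugees", "gdp"]:
    cleaned = df[col].str.replace(",", "", regex=False)
    # errors="coerce" turns anything unparseable (such as "*") into NaN
    df[col] = pd.to_numeric(cleaned, errors="coerce")

# Drop rows that are missing either value before plotting.
plottable = df.dropna(subset=["refugees", "gdp"])
print(plottable)
```

With numeric columns in place, `plottable.plot.scatter(x="gdp", y="refugees")` has something to draw.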

Here's what I'm aiming for as of now:

- which countries host the most Syrian refugees and their average GDP
- how many refugees do wealthy nations host (think I'll cut it off at top 10 or 20 GDP)
- what the US is doing

The last point there might be a bit of work, only because the UN's numbers on how many Syrian refugees are in the US appear to be a good bit different from what I'm seeing in other news reports that cite State Dept. data. I'll go after that data as time permits.

Any changes in direction or topic?

Including GDP information.

Problems/Questions

As ever, cleaning even the simplest data is very time-consuming, with lots of troubleshooting. But the biggest problem right now is my empty df error with matplotlib.

Checklist

playfairbot commented 6 years ago

Hi! I'm a little robot, here for a surprise inspection.

You need some feedback, let me summon @pasiegrist, @kidaemon, @SimoneLuc for you

It looks like we need to fix up your update a little bit! Edit it by clicking the pencil in the top right-hand corner. It requires:

Maybe you just didn't use the template? If not, edit your comment, cut and paste the template in, and then fill it out.

pasiegrist commented 6 years ago

Hey Matt

I really like your approach of adding additional data to the analysis. Using GDP makes sense to me. Maybe you could also do the math for Syrian refugees per population? Or compute a ratio that accounts for both population and economic size?
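The per-population and per-GDP ratios suggested here could be computed along these lines. This is a sketch only: the column names are hypothetical and the figures are placeholders, not real statistics.

```python
import pandas as pd

# Placeholder figures, not real statistics.
hosts = pd.DataFrame({
    "country": ["Country A", "Country B"],
    "refugees": [50_000, 50_000],
    "population": [10_000_000, 200_000_000],
    "gdp_usd": [5.0e11, 2.0e12],
})

# Refugees hosted per 100,000 residents.
hosts["refugees_per_100k"] = hosts["refugees"] / hosts["population"] * 100_000

# Refugees hosted per billion US dollars of GDP.
hosts["refugees_per_billion_gdp"] = hosts["refugees"] / (hosts["gdp_usd"] / 1e9)

print(hosts.sort_values("refugees_per_100k", ascending=False))
```

Two countries hosting the same number of refugees can rank very differently once population or economic size is taken into account, which is the point of normalizing.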

As you haven't drawn up any graphs yet, there is no critique there :) But I think going with bar charts should be safe and sound. Good luck with your data struggle. If you want any help, feel free to walk up.

mattrehbein commented 6 years ago

Update

Your project content: images/words/etc

Got a few basic graphs made. I've joined GDP onto my refugee data set, but there are a lot of discrepancies in the country names between the two dataframes, so I'm still working on cleaning that up. The biggest thing I'm aiming for right now is a good scatter plot of number of refugees and GDP.

image image
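For the country-name discrepancies between the two dataframes, one approach is to let pandas list the rows that fail to match and then patch the spellings with a small mapping. The names, values and the mapping below are illustrative assumptions, not the actual mismatches in the UNHCR or World Bank files.

```python
import pandas as pd

# Made-up rows; the two spellings of the second country are illustrative.
refugees = pd.DataFrame({
    "country": ["Country A", "Rep. of B", "Country C"],
    "refugees": [1200, 350, 90],
})
gdp = pd.DataFrame({
    "country": ["Country A", "B, Republic of", "Country C"],
    "gdp_usd": [2.5e12, 4.0e11, 1.1e10],
})

# An outer merge with indicator=True labels each row by the side it came from.
check = refugees.merge(gdp, on="country", how="outer", indicator=True)
unmatched = check[check["_merge"] != "both"]
print(unmatched[["country", "_merge"]])

# Hypothetical mapping from one source's spelling to the other's.
name_fixes = {"B, Republic of": "Rep. of B"}
gdp["country"] = gdp["country"].replace(name_fixes)

merged = refugees.merge(gdp, on="country", how="inner")
print(merged)
```

Rerunning the outer merge after each round of fixes shows how many names are still unmatched.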

Any changes in direction or topic?

No.

Problems/Questions

My scatter plot is coming up blank, so working on that. Otherwise, I overcame most of the earlier preliminary dataframe and matplotlib issues. I would love some feedback on how to tweak my graphs to make them look better/be more meaningful when I have one bar that's a lot bigger than the rest. I tried playing with the figsize, but maybe I need to play with the scale?

image

Checklist

dbaptistr commented 6 years ago

You've got some really interesting information so far! I agree with you that a scatter plot would be the best way to show GDP and refugees. I'm thinking matplotlib.pyplot.yscale might help with tweaking your graph, but I think that it has a lot of impact to see a bar much bigger than the rest. Maybe you could use the huge blank space in the graph to add some information. For example, in your "Syrian refugees in richest countries" you could add relevant numbers about Germany. How many refugees are there? What's the total GDP? I think it would also be valuable to compare countries by population. What's the rate of refugees per 100,000 citizens? Brazil and the UK have very different populations and are taking in almost the same number of refugees, from what I can see on your graph. Good job so far :)
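A sketch of the scale idea, assuming a horizontal bar chart, in which case the counts sit on the x-axis and the call is `set_xscale` rather than `yscale`. The countries and values are placeholders. A log axis makes the small bars visible but also hides how dominant the largest bar is, so it trades away the impact mentioned above.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt

# Placeholder values with one dominant bar.
countries = ["Country A", "Country B", "Country C", "Country D"]
refugees = [500_000, 12_000, 3_000, 40]

fig, ax = plt.subplots(figsize=(8, 4))
ax.barh(countries, refugees)

# Put the count axis on a log scale so small bars are still visible.
ax.set_xscale("log")
ax.set_xlabel("Syrian refugees hosted (log scale)")

# fig.savefig("refugees_log_scale.png")  # uncomment to write the chart to disk
```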

jlstro commented 6 years ago

Matt, great project idea! Some design ideas: I'd get rid of the horizontal grid lines for the bar charts, they don't really add information. Also, the numbers on the x-axis are hard to read, I think you should convert them to millions. The second bar chart is crazy because 3 out of 10 countries basically do not exist on there - but that's the message, so leave it and annotate, maybe?
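The grid-line and axis-label suggestions could look roughly like this in matplotlib. The three values are the rounded figures quoted in the final write-up of this thread; the `millions` helper and the figure size are assumptions made for the sketch.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

# Rounded figures as quoted in the thread's final write-up.
countries = ["Turkey", "Lebanon", "Jordan"]
refugees = [3_400_000, 1_000_000, 650_000]


def millions(value, _pos):
    """Format a raw count as millions, e.g. 1000000 -> '1.0M'."""
    return f"{value / 1e6:.1f}M"


fig, ax = plt.subplots(figsize=(8, 3))
ax.barh(countries, refugees)

ax.xaxis.set_major_formatter(FuncFormatter(millions))  # tick labels in millions
ax.grid(False)                                         # no grid lines
ax.set_xlabel("Syrian refugees hosted")
```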

mattrehbein commented 6 years ago

Final

Project visuals/text

Details

The Syrian civil war has displaced millions of people since it began roughly seven years ago. Most of those forced from their homes remained in Syria, but millions have fled to other countries. The majority of international Syrian refugees are in nearby countries. Turkey, Lebanon and Jordan host about 3.4 million, 1 million and 650,000, respectively, according to the UN Refugee Agency. I wanted to explore how the world's wealthier nations -- a measure I'm defining for the purposes of this exercise by GDP and GDP per capita -- are responding to the sprawling Syrian crisis in terms of the number of refugees they are hosting. All of the data presented here are as of 2017.

(The full text for the project is on the website.)

Here are my visuals (pls note there are no titles because I added them in html): image image image image image

Headline: Countries hosting Syrian refugees

Published website version: https://mattrehbein.github.io/

Code repository: https://github.com/mattrehbein/data_studio/tree/master/code/02-project

Final data set(s): World Bank and UNHCR

What did you find to be the most difficult part of this project?

Trying to find a meaningful narrative under the time constraints. There also appear to be different refugee numbers floating around out there from different agencies, so I spent a lot of time double checking that I wasn't misreading what I had in some fundamental way and trying to hash out why there were differences elsewhere.

Are you satisfied with what you produced? Is there anything you would like to change or improve?

I wish I could've found suitable data for my original idea of probing US poverty rates to fact-check the White House's proclamation that the War on Poverty is over, but for a quick audible this project was great for developing my data skills a bit further.

I ran out of time before figuring out how to draw cute arrow lines on my graphs, which I wanted to do at least for when I entered values for bars that were too small to see. I realize it looks terrible the way I have them just stuck in there.
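For reference, arrow labels for bars that are too small to see can be drawn with `Axes.annotate` and its `arrowprops` argument. This is a minimal sketch with placeholder values; the 5% threshold and the label position are arbitrary choices, not anything from the project code.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt

# Placeholder values: one large bar and two that are too small to see.
countries = ["Country A", "Country B", "Country C"]
refugees = [500_000, 1_200, 33]

fig, ax = plt.subplots(figsize=(8, 3))
ax.barh(countries, refugees)

largest = max(refugees)
for row, value in enumerate(refugees):
    # Only label bars shorter than 5% of the longest one (arbitrary cutoff).
    if value < 0.05 * largest:
        ax.annotate(
            f"{value:,}",                  # label text, e.g. "1,200"
            xy=(value, row),               # arrow tip: the end of the bar
            xytext=(0.25 * largest, row),  # label sits in the empty space to the right
            va="center",
            arrowprops={"arrowstyle": "->"},
        )
```

Because the bars are categorical, each bar sits at y position 0, 1, 2 and so on, which is why `row` from `enumerate` can be used as the y coordinate.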

Checklist

kevinlitman-navarro commented 6 years ago

maxarvid commented 6 years ago

Makes you wonder what the Liechtenstein 33 are up to… Great project, the only warning I would issue is that though Jordan and Lebanon appear on the first graph, they are then excluded in favor of the economic powerhouses of the world. I'm interested in, and vaguely aware of, the economic strain Jordan and Lebanon are facing, and I'm worried readers could react in a similar fashion. I do think there's a story in the current iteration, and to me it is about how the wealthiest don't really help out very much. Maybe this speaks to Kevin's second point, and could be resolved by a short paragraph at the end reframing the issue in the suggested fashion: who is doing more to help.