Open mattrehbein opened 6 years ago
To add another element (and data set) to the project, I got GDP data from the World Bank, in order to see how it relates to Syrian refugee hosting (how are the world's richest nations responding to one of the world's biggest and most publicized crises).
I plan to see how a scatter plot looks when comparing refugee levels and GDP, but I don't have much to show graph-wise at the moment. I keep getting an "empty dataframe" error when I try to plot, and I haven't been able to fix it yet.
Here's what I'm aiming for as of now:
- which countries host the most Syrian refugees, and their average GDP
- how many refugees wealthy nations host (I think I'll cut it off at the top 10 or 20 by GDP)
- what the US is doing
The last point there might be a bit of work, only because the UN's numbers on how many Syrian refugees are in the US appear to be a good bit different from what I'm seeing in other news reports that cite State Dept. data. I'll go after that data as time permits.
Including GDP information.
As ever, cleaning even the simplest data is very time-consuming, with lots of troubleshooting. But the biggest problem right now is my empty dataframe error with matplotlib.
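The thread never says what caused the empty dataframe error. One common cause, assumed here, is that the numeric columns were read in as text, so pandas finds nothing numeric to plot. The sketch below uses a made-up table (the column names and values are placeholders, and `to_number` is a hypothetical helper) to show how to check the dtypes and coerce the columns.

```python
import pandas as pd

# Hypothetical stand-in for the merged refugee/GDP table; all values are placeholders.
df = pd.DataFrame({
    "country": ["A", "B", "C"],
    "refugees": ["1,200", "350", "*"],   # numbers stored as text, "*" marks a missing value
    "gdp": ["5.0e11", "2.1e10", ".."],   # ".." marks a missing value
})

def to_number(series):
    """Strip thousands separators, then turn anything non-numeric into NaN."""
    cleaned = series.astype(str).str.replace(",", "", regex=False)
    return pd.to_numeric(cleaned, errors="coerce")

# Text dtypes here are a hint that the plotting call has no numeric data to work with.
print(df.dtypes)

df["refugees"] = to_number(df["refugees"])
df["gdp"] = to_number(df["gdp"])

# Keep only rows where both values survived the conversion.
plottable = df.dropna(subset=["refugees", "gdp"])
print(len(plottable), "rows left to plot")
```

If `plottable` comes back with zero rows, the problem is more likely upstream, for example a filter or a join that matched nothing.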
Hi! I'm a little robot, here for a surprise inspection.
You need some feedback, let me summon @pasiegrist, @kidaemon, @SimoneLuc for you
It looks like we need to fix up your update a little bit! Edit it by clicking the pencil in the top right-hand corner. It requires:
Maybe you just didn't use the template? If not, edit your comment, cut and paste the template in, and then fill it out.
Hey Matt
I really like your approach of adding additional data to the analysis. Using GDP makes sense to me. Maybe you could also do the math for Syrian refugees per population? Or compute a ratio that accounts for both population and economic size?
As you have not drawn up any graphs yet, there is no critique there :) But I think going with bar charts should be safe and sound. Good luck with your data struggle. If you want any help, feel free to walk up.
Got a few basic graphs made. I've joined GDP onto my refugee data set, but there are a lot of discrepancies in the country names between the two dataframes, so I'm still working on cleaning that up. The biggest thing I'm aiming for right now is a good scatter plot of number of refugees and GDP.
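For the country-name discrepancies, one possible approach is an outer merge with `indicator=True`, which flags the rows that appear in only one of the two tables. This is a minimal sketch, not the code from the repository: the frames, the values, and the `name_fixes` mapping are illustrative placeholders.

```python
import pandas as pd

# Placeholder frames: the spellings illustrate a mismatch, the numbers are made up.
refugees = pd.DataFrame({
    "country": ["Turkey", "Iran (Islamic Rep. of)", "Germany"],
    "refugees": [3, 2, 1],
})
gdp = pd.DataFrame({
    "country": ["Turkey", "Iran, Islamic Rep.", "Germany"],
    "gdp": [10, 20, 30],
})

# An outer merge keeps every row; the _merge column says which side each row came from.
merged = refugees.merge(gdp, on="country", how="outer", indicator=True)
unmatched = merged[merged["_merge"] != "both"]
print(unmatched[["country", "_merge"]])

# Hypothetical mapping from one source's spelling to the other's.
name_fixes = {"Iran (Islamic Rep. of)": "Iran, Islamic Rep."}
refugees["country"] = refugees["country"].replace(name_fixes)

# After the fix, an inner merge keeps all three countries.
clean = refugees.merge(gdp, on="country", how="inner")
print(clean)
```

Printing `unmatched` first gives a short list of names to reconcile by hand, which is usually quicker than comparing the two country columns by eye.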
No.
My scatter plot is coming up blank, so working on that. Otherwise, I overcame most of the earlier preliminary dataframe and matplotlib issues. I would love some feedback on how to tweak my graphs to make them look better/be more meaningful when I have one bar that's a lot bigger than the rest. I tried playing with the figsize, but maybe I need to play with the scale?
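On the scale question, a logarithmic axis is one option when a single bar dwarfs the rest. The sketch below assumes a horizontal bar chart and uses placeholder country labels and counts.

```python
import matplotlib.pyplot as plt

# Placeholder data: one value far larger than the others.
countries = ["A", "B", "C", "D"]
refugees = [500_000, 12_000, 3_000, 40]

fig, ax = plt.subplots(figsize=(8, 4))
ax.barh(countries, refugees)

# A log scale spaces the bars by order of magnitude instead of absolute size.
ax.set_xscale("log")
ax.set_xlabel("Syrian refugees hosted (log scale)")
```

Call `plt.show()` or `fig.savefig(...)` to see the result. For vertical bars the equivalent call is `ax.set_yscale("log")`. The trade-off is that a log axis visually shrinks the gap between the largest host and everyone else, and that gap may be the point of the chart.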
You've got some really interesting information so far! I agree with you that a scatter plot would be the best way to show GDP and refugees. I'm thinking matplotlib.pyplot.yscale might help with tweaking your graph, but I think that it has a lot of impact to see a bar much bigger than the rest. Maybe you could use the huge blank space in the graph to add some information. For example, in your "Syrian refugees in richest countries" you could add relevant numbers about Germany. How many refugees are there? What's the total GDP? I think it would also be valuable to compare countries by population. What's the rate of refugees per 100,000 citizens? Brazil and the UK have very different populations and are taking in almost the same number of refugees, from what I can see on your graph. Good job so far :)
Matt, great project idea! Some design ideas: I'd get rid of the horizontal grid lines for the bar charts, they don't really add information. Also, the numbers on the x-axis are hard to read, I think you should convert them to millions. The second bar chart is crazy because 3 out of 10 countries basically do not exist on there - but that's the message, so leave it and annotate, maybe?
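The per-population comparison suggested here could be sketched as follows. The column names are assumptions and every number is a placeholder, not real data. A ratio against GDP is included as well, since it follows the same pattern.

```python
import pandas as pd

# Placeholder table: hypothetical column names, made-up values.
df = pd.DataFrame({
    "country": ["A", "B"],
    "refugees": [1_000, 500],
    "population": [1_000_000, 10_000_000],
    "gdp_usd": [50e9, 400e9],
})

# Refugees hosted per 100,000 residents.
df["refugees_per_100k"] = df["refugees"] * 100_000 / df["population"]

# Refugees hosted per billion US dollars of GDP.
df["refugees_per_bn_gdp"] = df["refugees"] / (df["gdp_usd"] / 1e9)

print(df[["country", "refugees_per_100k", "refugees_per_bn_gdp"]])
```

Sorting by either new column would then rank countries by how much they host relative to their size, not by raw totals.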
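A minimal sketch of those two design tweaks, assuming a horizontal bar chart: the tick labels are formatted in millions with matplotlib's `FuncFormatter`, and the grid is switched off. The counts are the approximate figures given in the project write-up, and `millions` is a helper name made up for this example.

```python
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

# Approximate hosting figures from the project write-up.
countries = ["Turkey", "Lebanon", "Jordan"]
refugees = [3_400_000, 1_000_000, 650_000]

def millions(value, _pos):
    """Format a raw count as millions, e.g. 3400000 -> '3.4M'."""
    return f"{value / 1e6:g}M"

fig, ax = plt.subplots(figsize=(8, 3))
ax.barh(countries, refugees)

# Replace the long raw numbers on the x-axis with compact labels.
ax.xaxis.set_major_formatter(FuncFormatter(millions))

# Drop the grid lines, which add little to a bar chart.
ax.grid(False)
ax.set_xlabel("Syrian refugees hosted")
```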
The Syrian civil war has displaced millions of people since it began roughly seven years ago. Most of those forced from their homes remained in Syria, but millions have fled to other countries. The majority of international Syrian refugees are in nearby countries. Turkey, Lebanon and Jordan host about 3.4 million, 1 million and 650,000, respectively, according to the UN Refugee Agency. I wanted to explore how the world's wealthier nations -- a measure I'm defining for the purposes of this exercise by GDP and GDP per capita -- are responding to the sprawling Syrian crisis in terms of the number of refugees they are hosting. All of the data presented here are as of 2017.
(The full text for the project is on the website.)
Here are my visuals (pls note there are no titles because I added them in html):
Headline: Countries hosting Syrian refugees
Published website version: https://mattrehbein.github.io/
Code repository: https://github.com/mattrehbein/data_studio/tree/master/code/02-project
Final data set(s): World Bank and UNHCR
Trying to find a meaningful narrative under the time constraints. There also appear to be different refugee numbers floating around out there from different agencies, so I spent a lot of time double checking that I wasn't misreading what I had in some fundamental way and trying to hash out why there were differences elsewhere.
I wish I could've found suitable data on my original idea of data probing US poverty rates to sort of fact check the White House's proclamation about the War on Poverty being over, but for a quick audible this project was great for developing my data skills a bit further.
I ran out of time before figuring out how to draw cute arrow lines on my graphs, which I wanted to do at least for when I entered values for bars that were too small to see. I realize it looks terrible the way I have them just stuck in there.
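For the arrow labels, matplotlib's `ax.annotate` with an `arrowprops` argument is one way to point at a bar that is too small to see. This is a sketch with placeholder data, not the project's chart code.

```python
import matplotlib.pyplot as plt

# Placeholder data: the last bar is too small to see at this scale.
countries = ["A", "B", "C"]
refugees = [500_000, 12_000, 33]

fig, ax = plt.subplots(figsize=(8, 3))
bars = ax.barh(countries, refugees)

# Point an arrow at the tip of the tiny bar and place its value to the right.
small = bars[-1]
note = ax.annotate(
    f"{refugees[-1]:,}",
    xy=(small.get_width(), small.get_y() + small.get_height() / 2),
    xytext=(40, 0),                 # offset the label 40 points to the right
    textcoords="offset points",
    arrowprops=dict(arrowstyle="->"),
    va="center",
)
```

Because the label position is given as an offset in points, it stays the same distance from the bar even if the axis limits change.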
Makes you wonder what the Liechtenstein 33 are up to… Great project, the only warning I would issue is that though Jordan and Lebanon appear on the first graph, they are then excluded in favor of the economic powerhouses of the world. I'm interested in, and vaguely aware of, the economic strain Jordan and Lebanon are facing, and I'm worried readers could react in a similar fashion. I do think there's a story in the current iteration, and to me it is about how the wealthiest don't really help out very much. Maybe this speaks to Kevin's second point, and could be resolved by a short para at the end reframing the issue in the suggested fashion: who is doing more to help.
Pitch
Summary
I want to explore where Syrian refugees have gone (country-wise) since the country's civil war began in 2011. Because my data covers the period since then, I'd also like to see whether there's anything interesting about how the total number of Syrian refugees has changed year to year. My data comes from UNHCR.
Details
Possible headline(s): Where Syrian Refugees Go
Data set(s): http://popstats.unhcr.org/en/persons_of_concern
Code repository: https://github.com/mattrehbein/data_studio/tree/master/code/02-project
Possible problems/fears/questions:
Work so far
I've found my data set on UNHCR, but am having a hard time getting pandas to accept the csv I downloaded, so work thus far is pretty limited. Here's what I'd like to end up with, but focused on Syria specifically rather than on all refugees worldwide.
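The thread doesn't say why pandas rejected the csv. One possibility, assumed here, is that the export starts with descriptive lines before the real header row and marks suppressed values with `*`. The sample text below is a made-up stand-in for the download, so the column names and counts are placeholders.

```python
import io
import pandas as pd

# Made-up stand-in for the downloaded file: two preamble lines, then the real header.
raw = """Extracted from a statistics database (placeholder preamble)
Another placeholder preamble line
Year,Country / territory of asylum,Origin,Refugees
2016,Turkey,Syrian Arab Rep.,100
2016,Liechtenstein,Syrian Arab Rep.,*
"""

# skiprows drops the preamble so the third line becomes the header;
# na_values turns the "*" marker into NaN so the column stays numeric.
df = pd.read_csv(io.StringIO(raw), skiprows=2, na_values=["*"])

print(df.dtypes)
print(df)
```

With a real file, pass the path instead of `io.StringIO(raw)`, and open the csv in a text editor first to count how many lines precede the header.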
Checklist