jsoma / data-studio-projects

02-Tobacco-Exports #173

Open kellykiki opened 6 years ago

kellykiki commented 6 years ago

Please complete all of the following sections, or the ghost of Joseph Pulitzer will spookily dance around your issue! A completed version of this template can be found at https://github.com/jsoma/data-studio-projects/issues/1

Pitch

Summary

This UN database provides current and historical data about tobacco and cigarette exports and imports by country. I would like to find the "big players" for 2017 (additional inspiration here) and look back to see how they have performed over the years.

I hope this project can be combined with an additional story about the major production areas in each top country and about workers on tobacco farms.

Details

Possible headline(s):

Data set(s): http://data.un.org/Data.aspx?d=ComTrade&f=_l1Code%3a25

Code repository: https://github.com/kellykiki/data-studio/tree/master/code/02-project

Possible problems/fears/questions: Possible problem: it might not stand as a story on its own. Fear: time management. Questions: sure, they will arise while coding. At the moment, my main concern is how to handle trade values (US$ billions) and weight values (millions and thousands of kilos) on the axes.

Work so far

- Cleaning data and transforming dataframes
- Some analysis
- Draft graphs produced

2017 Tobacco Exports Ranking tob_ranking

Trade ($) by top-country tob_trade

Exported tobacco weight by top-country tob_weight

Improvements:

Checklist

This checklist must be completed before you submit your draft.

jsoma commented 6 years ago

kellykiki commented 6 years ago

Update

Content

Graph 1 top-20-exporters-2017

Graph 2 - notes:
- I'll need to fix the tick labels on the horizontal axis. I need them to end up with "5 billions (US$)". I've tried FuncFormatter, but I couldn't get it to work for this specific graph.
- How can we change the font size of the legend title?

domestic-goods
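For reference, here is a minimal sketch of one way both Graph 2 questions could be handled in matplotlib. The country names and values are made up for illustration and are not taken from the UN data; `billions` is a hypothetical helper name, and the suffix text is easy to adjust.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter


def billions(value, pos=None):
    """Turn a raw US$ value into a label such as '5 billion (US$)'."""
    return f"{value / 1e9:g} billion (US$)"


# Hypothetical values, NOT from the UN dataset
countries = ["Country A", "Country B", "Country C"]
exports_usd = [5.0e9, 2.5e9, 1.0e9]

fig, ax = plt.subplots()
ax.barh(countries, exports_usd, label="Cigarettes")

# Apply the formatter to the horizontal (value) axis
ax.xaxis.set_major_formatter(FuncFormatter(billions))

# The legend title is a separate text object, so it can be sized on its own
legend = ax.legend(title="Tobacco goods")
legend.get_title().set_fontsize(14)

fig.canvas.draw()  # render once to make sure everything is valid
```

The formatter receives the raw axis value, so the data can stay in plain dollars and only the labels are rescaled.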

Graph 3 - notes:
- Once I moved the legend to a better position, I lost the legend title. Why? Here is my only code regarding the legend: ax.legend(title='Tobacco goods') ax.legend(loc='lower right')
- How can I "hide" countries that do not re-export anything (vertical axis)? Can I do it using matplotlib, or is it time for me to discover Illustrator?
- Again, I have a problem setting tick values on the horizontal axis for this specific graph. I'll need them to end up with "2 billions (US$)".

re-exports-foreign-goods
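On the lost legend title: each call to `ax.legend()` builds a fresh legend, so a second call replaces the first one and drops its title. Passing both arguments in a single call avoids that. The sketch below also shows one possible way to hide countries with no re-exports by filtering the dataframe before plotting. The table layout and numbers are assumptions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical re-export values in US$, one column per product (made-up numbers)
reexports = pd.DataFrame(
    {
        "Cigarettes": [2.0e9, 0.0, 0.5e9, 0.0],
        "Cigars": [0.3e9, 0.0, 0.0, 0.0],
    },
    index=["USA", "Germany", "China", "Poland"],
)

# Keep only the countries whose row total is above zero
nonzero = reexports[reexports.sum(axis=1) > 0]

fig, ax = plt.subplots()
nonzero.plot.barh(stacked=True, ax=ax)

# One call with both title and position, so neither setting is lost
ax.legend(title="Tobacco goods", loc="lower right")
```

Filtering in pandas keeps the chart reproducible from code, so no manual cleanup in Illustrator is needed for this step.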

Graph 4 - notes:
- I didn't label directly at the end of the lines, since I added more lines and I thought it might be "noisy". I'd love to know, though, how we can do it with some code in matplotlib/pandas. It would be nice for a few lines, as I had in my first draft graph.
- How can we change legend labels (not only the title)? For example, I'll need to turn the "german_usd" label into "Germany".

top-5-exporters-overtime
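A rough sketch of both Graph 4 ideas, assuming a dataframe indexed by year with one column per country. Only the `german_usd` column name comes from this thread; `polish_usd`, the `names` mapping, and all values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical yearly export values in US$ billions (made-up numbers)
df = pd.DataFrame(
    {"german_usd": [3.0, 4.0, 5.0], "polish_usd": [1.0, 2.0, 3.5]},
    index=[2015, 2016, 2017],
)
names = {"german_usd": "Germany", "polish_usd": "Poland"}

fig, ax = plt.subplots()
df.plot(ax=ax)

# Option 1: rebuild the legend with readable labels
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles, [names.get(label, label) for label in labels], title="Exporter")

# Option 2: write each name next to the last point of its line
for column in df.columns:
    ax.annotate(
        names[column],
        xy=(df.index[-1], df[column].iloc[-1]),  # last data point of the line
        xytext=(5, 0),                           # nudge 5 points to the right
        textcoords="offset points",
        va="center",
    )
```

Renaming the columns before plotting, for example `df.rename(columns=names).plot()`, should give the same legend labels with less code.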

Any changes in direction or topic?

Not really

Problems/Questions

In the original database, weight values are provided for various categories of tobacco products exported by each country. Because I've already worked on the trade value (USD) of each country's exports, I thought it would be nice to know not only how much each country earns but also how many tons of tobacco each country exports. That way, we can see which countries are the cheapest and which are the most expensive.

So, I transformed the original database a lot, did many calculations, and produced the graph; it was only then that I discovered there are NaN values in the weight column for certain categories of tobacco products exported by specific countries. Therefore, I cannot show the graph displaying the weight of tobacco exported over the years (line chart).

To overcome this and give you a sense of the cheapest/most expensive countries, I thought that I could:

What do you think about that?

Checklist

playfairbot commented 6 years ago

Greetings! I'm a little robot, let's take a peek.

You need some feedback, let me summon @ksliney, @ElinaMak, @xeophin for you

xeophin commented 6 years ago

Hey Kelly, this looks good (and like quite some work!). I also like the colour scheme, and how the chart background seems in dire need of a repaint after years of indoor smoking ... ;)

Graph 1

At work I don't tend to call out the top entity, as it is already prominent based on its size alone, so this would be somewhat redundant. You could either go with a country that your story revolves around (like the US, whose exports seem to have dropped considerably according to chart 4; why?) or the country of the publication the article is appearing in (which helps readers compare "themselves" to other countries). Or you could have a bar that displays the average, to give a hint of who's above or below average.

Chart 3

Yes, the countries with no exports could probably go.

Graph 4

I think this chart could also benefit from one line being shown prominently, depending on what story you want to tell – the decline of the US? The rise of Germany? Poland as a new contender?

kellykiki commented 6 years ago

Update - Revision 2

Content

Graph 1 top-20-exporters-2017

Graph 2 domestic

Graph 3 foreign-goods

Graph 4 top-5-exporters-overtime

Graph 5 trade-value-percentages

Graph 6 price-per-kg

Note: China, USA and Indonesia are excluded from this analysis, because weight data are not provided for some categories of exported tobacco products.

Any changes in direction or topic?

Not really

Problems/Questions

I overcame problems regarding graphs that I described in Revision 1.

The most significant and hardest part of the analysis was deciding how to handle missing weight data for specific categories of some countries' exported goods. That problem affected the price/kg analysis presented in Graph 6. After a lot of thought and several different attempts at the analysis, I finally decided to exclude China, USA and Indonesia from this specific analysis, as those countries' weight data are missing for certain categories of exported tobacco. I concluded that this is the most honest and reliable way to handle it.
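The exclusion step can be sketched in pandas roughly as follows. The column names (`country`, `commodity`, `trade_usd`, `weight_kg`) and all numbers are assumptions made up for illustration; they are not the actual structure or values of the UN dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical rows: one per country and product category (made-up numbers)
trade = pd.DataFrame(
    {
        "country": ["Germany", "Germany", "China", "China", "Poland"],
        "commodity": ["Cigarettes", "Cigars", "Cigarettes", "Cigars", "Cigarettes"],
        "trade_usd": [300.0, 100.0, 200.0, 50.0, 120.0],
        "weight_kg": [30.0, 10.0, 25.0, np.nan, 15.0],
    }
)

# Countries with at least one category whose weight is missing
incomplete = trade.loc[trade["weight_kg"].isna(), "country"].unique()

# Drop those countries entirely, then total value and weight per country
complete = trade[~trade["country"].isin(incomplete)]
totals = complete.groupby("country")[["trade_usd", "weight_kg"]].sum()
totals["usd_per_kg"] = totals["trade_usd"] / totals["weight_kg"]
```

Dropping the whole country matters because pandas skips NaN when summing by default: a country with a missing weight would still carry its full dollar value but only part of its weight, which would inflate its price per kg.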

Checklist

kellykiki commented 6 years ago

Hey, @xeophin! Thanks for your feedback! I've worked on Revision 2 in the meantime. So, here is the updated work, if you want to have another look at it. I'll probably check the "one prominent line" suggestion for the final version ;) Thanks!

maxarvid commented 6 years ago

Fascinating! I'll go graph by graph:

  1. I like the subtitle annotation, it really drives home that Germany exports way more than any other country (and as such doesn't need a grid imo). If you do want to try something you could make it only five countries and put the values that they are exporting inside the bars, but this would lose the sense of how quickly exports drop when you move away from the big players so only if you absolutely want to change something.

  2. This is my least favorite for the following reasons: I'm not sure what the sub-header annotation means ("Dollar value is 94%..."), there are too many different kinds of tobacco products and I have no idea what their names mean, and their distributions seem to mean something but I need some help understanding what (is it significant that Brazil mostly exports tobacco refuse, for example?).

  3. There's a lot of blank space in this one (not necessarily bad), and the inclusion of Italy given the sub-header is throwing me off. Are the three the only countries that are re-exporting tobacco products? Maybe also simplify the legend and drop the products that are not being re-exported.

  4. I love this one! The only thing I would change is scrapping the blank space between 1994 and the y-axis.

  5. Give Germany a stand out color like in graph 1. Put the percent sign in the German box and let the reader be clever.

  6. To drive home that the red horizontal bar is the mean, the lettering of the sub-header could be colored red. Also, the larger dots at either end look a bit distracting; I would suggest keeping them all the same size. Just to be clear, are these rates based on all different tobacco products lumped together?

Really nice looking graphs!

kellykiki commented 6 years ago

@maxarvid, thanks for your wonderful and detailed feedback! I've already followed most of your suggestions! I'm keeping all the remarks in mind and I'd love to update the website after the final version submission ;)

kellykiki commented 6 years ago

Final

Project visuals/text

After the feedback, I made some changes, mostly regarding:

I also reviewed some of my headlines and my website texts.

Graph 1

Top-20 Countries in Tobacco Cigarettes Exports (2017)

Germany is the “big player”

top-20-exporters-2017

Graph 2

Exports of Domestic Tobacco Goods by Top Country

Dollar value of domestic goods exported amounts to 94% of the total tobacco trade value

domestic-goods

Graph 3

Re-Exports by Top Country

USA, China and Italy are the only exporters of foreign tobacco goods

foreign-goods

Graph 4

US$ Value Shares on Total Exports Worldwide

German exports value is 15% of the global total

trade-value-pct

Graph 5

Top Exporters' Rates

Average price per kg is 10.36 US$.

price-per-kg

Graph 6

How Top-5 Exporters Perform Over The Years

It seems that 2004 is a milestone for both growing and weakened forces

top-5-exporters-overtime

Details

Headline:

TOBACCO EXPORTS WORLDWIDE

Germany drives tobacco market

Published website version: https://kellykiki.github.io/tobacco/

Code repository: https://github.com/kellykiki/data-studio/tree/master/code/02-project

Final data set(s): http://data.un.org/Data.aspx?d=ComTrade&f=_l1Code%3a25

What did you find to be the most difficult part of this project?

The most difficult part of this project was transforming the original database: creating new dataframes and later merging some of them to get the desired results. The weight data issue was also a significant difficulty/limitation; please find more details on that in the "Problems/Questions" sections of this project's updates.

Are you satisfied with what you produced? Is there anything you would like to change or improve?

I am always happy with the process of conceptualizing how I can visualize my analysis. I am happy with my color schemes. I would like to improve the stacked bar chart about domestic goods: specifically, I would like to have the time to analyze each type of commodity separately. I think it would make the information about countries/commodities clearer, and it might give us further information on the dominant countries by commodity.

If I had more time, I would also like to analyze imports, probably drawing on additional interesting databases that I found along the way.

Checklist

malbasi commented 6 years ago

Hey Kelly,

Your final project turned out great. You did some wonderful viz with a ton of information. Great job!

While it is all legible (and beautiful) I feel like it would be more impactful if you simplified some of the graphs. I don't mean the design, but the actual data inside. I mean, in graph 2, since you show so much data about the types, maybe limit the number of countries you're showing the breakdown of to just the top 3 instead of the top 20.

jlstro commented 6 years ago

Hey! Didn't realize that Germans smoke so much, crazy... There is actually much more info in the final project site than I expected, so great job on that. A few things that I'd do differently (even though you are finished now, I assume): The different background colors confuse me. Why not one for all charts? For the second chart, I'd take advantage of our journalistic freedom and group/simplify the categories. There's too much text in my opinion and some of the segments are really small. So are some of the fonts, a bit hard to read here and there. Other than that, great page!