holtzy / R-graph-gallery

A website that displays hundreds of R charts with their code
https://www.r-graph-gallery.com
MIT License

Add #101

Open holtzy opened 1 year ago

holtzy commented 1 year ago

https://github.com/gkaramanis/tidytuesday/blob/master/2023/2023-week_34/refugees.R

JosephBARBIERDARNAL commented 2 weeks ago

I can't reproduce it. The code works when run as a plain .R script but not inside an .Rmd, and I don't understand why. Here is the Rmd I wrote, just in case:

---
title: "Analyzing Refugee Flows to Europe (2010-2022): An Alluvial Visualization"
descriptionMeta: "Explore a detailed alluvial visualization of refugee flows to Europe from 2010 to 2022 using R and ggsankey."
descriptionTop: "This post presents an **alluvial plot** using R and [ggsankey](package/ggsankey.html) to visualize refugee migration trends to European countries over a span of 12 years. It highlights the major regions and the number of refugees contributing to the total."
sectionText: "Alluvial Plot"
sectionLink: "package/ggsankey.html"
DataToVizText: "Data to Viz"
DataToVizLink: "data-to-viz.com/graph/alluvial.html"
url: "web-combine-sankey-and-bump-charts-in-R"
output:
  html_document:
    self_contained: false
    mathjax: default
    lib_dir: libs
    template: template_rgg.html
    css: style.css
    toc: TRUE
    toc_float: TRUE
    toc_depth: 2
    df_print: "paged"
---

```{r global options, include = FALSE}
knitr::opts_chunk$set(
   warning = FALSE,
   message = FALSE,
   fig.align = "center"
)
```

# About
***

This post visualizes refugee migration trends to Europe from 2010 to 2022 using an alluvial plot. The data comes from the UNHCR Refugee Population Statistics Database and covers refugee movements from various countries of origin to European nations. The plot emphasizes the major countries of origin and the number of refugees over time, showing how global events shape migration patterns.

# Libraries and Data
***

We'll use several R packages to create this visualization: `tidyverse` for data manipulation, `ggsankey` for the alluvial plot, and `ggh4x` for enhanced axis ticks. The code also calls `rnaturalearthdata`, `ggtext`, `colorspace`, and `scales` through `::`, so these need to be installed as well.

```{r}
library(tidyverse)
library(ggsankey)
library(ggh4x)
```

The data is sourced directly from the TidyTuesday dataset repository and includes refugee statistics categorized by country of origin and country of asylum. We keep only the rows where the country of asylum is in Europe.

```{r}
# Load data
population <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-08-22/population.csv")

# European countries
eu <- rnaturalearthdata::countries110 %>%
  as_tibble() %>%
  filter(continent == "Europe")

# Refugees by year to Europe
ref_eu_year <- population %>%
  mutate(
    coo_name = case_when(
      str_detect(coo_name, "Syria") ~ "Syria",
      str_detect(coo_name, "Serbia") ~ "Serbia and Kosovo",
      TRUE ~ coo_name
    )
  ) %>%
  filter(coa_iso %in% eu$sov_a3) %>%
  group_by(year, coo_name) %>%
  summarise(
    total = sum(refugees)
  ) %>%
  ungroup() %>%
  group_by(year) %>%
  mutate(top_year = total == max(total)) %>%
  ungroup()

# Regions with > 100 000 refugees
ref_top <- ref_eu_year %>%
  group_by(coo_name) %>%
  summarise(total = sum(total)) %>%
  ungroup() %>%
  arrange(-total) %>%
  filter(total >= 1e5)

# Annotations for highlighted regions
annot <- tribble(
  ~country, ~x, ~y, ~yr1, ~yr2,
  "Serbia and Kosovo", 2011, 1.5e6, 2010, 2012,
  "Iraq", 2013, 1.3e6, 2013, 2013,
  "Ukraine", 2014.5, 1.8e6, 2014, 2015,
  "Syria", 2018, 2.6e6, 2016, 2021,
  "Ukraine", 2020.7, 5e6, 2022, 2022
) %>%
  rowwise() %>%
  mutate(
    # Total for the highlighted country over its [yr1, yr2] window
    tot = sum(subset(ref_eu_year, coo_name == country & between(year, yr1, yr2))[, "total"]),
    # Rich-text label: country, period, and formatted total on separate lines
    label = paste0(
      "**", country, "**",
      if_else(yr1 != yr2, paste0("<br>", yr1, "-", yr2), paste0("<br>", yr2)),
      "<br>",
      "**", scales::number(tot, accuracy = 0.1, scale_cut = scales::cut_short_scale()), "**"
    )
  ) %>%
  ungroup()
```

# Visualization
***

We use `ggsankey` to create an alluvial plot that illustrates the refugee flow over time. Annotations label the regions that ranked first during each period, together with their totals.

```{r}
f1 <- "Outfit"
f2 <- "Newsreader Display"

ref_eu_year %>%
  filter(coo_name %in% ref_top$coo_name) %>%
  ggplot() +
  geom_sankey_bump(
    aes(
      x = year, node = coo_name,
      fill = if_else(coo_name %in% annot$country, coo_name, NA),
      value = total,
      color = after_scale(colorspace::lighten(fill, 0.4))
    ),
    linewidth = 0.3, type = "alluvial", space = 1e4, alpha = 0.9
  ) +
  # Labels for the highlighted regions
  ggtext::geom_richtext(
    data = annot,
    aes(x = x, y = y, label = label, color = country, size = tot),
    vjust = 0, family = f1, fill = NA, label.color = NA
  ) +
  # Totals for 2010-2022, shown to the right of the plot
  annotate("text", x = 2022.2, y = 7.3e6, label = "Totals for\n2010-2022\n↓", family = f1, hjust = 0, vjust = 1, color = "grey30") +
  geom_text(
    data = ref_top %>% filter(row_number() <= 5),
    aes(
      x = 2022.2, y = c(5e6, 1.9e6, 1.25e6, 1e6, 0.8e6),
      label = scales::number(total, accuracy = 0.1, scale_cut = scales::cut_short_scale())
    ),
    family = f1, hjust = 0,
    color = c("#0057B7", "#CE1126", "grey10", "#017b3d", "grey10")
  ) +
  annotate("text", x = c(2020, 2020.5), y = c(1.28e6, 0.85e6), label = c("Afghanistan", "Eritrea"), color = "grey99", family = f1, size = c(4, 3.2)) +
  # Title and subtitle
  annotate("text", x = 2010, y = 7.2e6, label = "Where do the 22 million refugees in Europe come from?", size = 8, family = f2, fontface = "bold", hjust = 0, color = "#181716") +
  annotate("text", x = 2010, y = 6.8e6, label = str_wrap("The chart displays rankings and the number of refugees from each region from 2010 to 2022. Top-ranked regions during different periods are highlighted. Only regions with a total of at least 100 000 refugees for the period 2010-2022 are included.", 78), size = 5.5, family = f1, hjust = 0, vjust = 1, color = "#393433") +
  scale_x_continuous(breaks = seq(2010, 2022, 2), guide = "axis_minor") +
  scale_fill_manual(values = c("#017b3d", "#0C4077", "#CE1126", "#0057B7"), na.value = "grey60") +
  scale_color_manual(values = c("#017b3d", "#0C4077", "#CE1126", "#0057B7"), na.value = "grey60") +
  scale_size_continuous(range = c(4, 6)) +
  coord_cartesian(clip = "off", expand = FALSE) +
  labs(
    caption = "Source: UNHCR Refugee Population Statistics Database · Graphics: Georgios Karamanis"
  ) +
  theme_minimal(base_family = f1) +
  theme(
    legend.position = "none",
    plot.background = element_rect(fill = "#FFFFFE", color = NA),
    axis.title = element_blank(),
    axis.text.y = element_blank(),
    axis.text.x = element_text(size = 12, margin = margin(5, 0, 0, 0), family = f1, color = "#393433"),
    panel.grid = element_blank(),
    axis.ticks.x = element_line(color = "grey70"),
    ggh4x.axis.ticks.length.minor = rel(1),
    plot.margin = margin(20, 75, 10, 35),
    plot.caption = element_text(margin = margin(20, 0, 0, 0))
  )
```

# Going further
***

The alluvial plot shows how the origins of refugees in Europe shifted between 2010 and 2022. Key events, such as the conflicts in Syria and Ukraine, appear as peaks in the chart. Visualizing these trends can help policymakers and analysts understand the dynamics of refugee movements and address humanitarian needs.
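
To check the annotation totals in isolation, the windowed sum computed in the `rowwise()` step for `annot` can be reproduced in base R. This is only a sketch on invented toy numbers, not the UNHCR data; `toy` and `windows` are hypothetical stand-ins for `ref_eu_year` and `annot`.

```r
# Toy stand-in for ref_eu_year; the numbers are invented for illustration only
toy <- data.frame(
  year     = c(2010, 2011, 2012, 2010, 2011, 2012),
  coo_name = c("A", "A", "A", "B", "B", "B"),
  total    = c(100, 200, 300, 10, 20, 30),
  stringsAsFactors = FALSE
)

# Hypothetical annotation windows, mirroring the country/yr1/yr2 columns of annot
windows <- data.frame(
  country = c("A", "B"),
  yr1     = c(2010, 2012),
  yr2     = c(2011, 2012),
  stringsAsFactors = FALSE
)

# Sum `total` for each country over its [yr1, yr2] window (both ends inclusive)
windows$tot <- mapply(
  function(country, yr1, yr2) {
    in_window <- toy$coo_name == country & toy$year >= yr1 & toy$year <= yr2
    sum(toy$total[in_window])
  },
  windows$country, windows$yr1, windows$yr2,
  USE.NAMES = FALSE
)

print(windows)
```

If the totals printed by a check like this match in a plain R session but differ when knitting, the problem is more likely in the rendering environment than in the data preparation.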