{prenoms}
(French for “first names”) lets you explore data on the
first names given to children born in metropolitan France between 1900
and 2021.
The data are available at the national level and by department.
Source: These statistics come from French civil registry records (état civil). They are compiled by the National Institute of Statistics and Economic Studies (Insee), which collects, analyses and disseminates information on the French economy and society. These statistics are available here.
# install.packages("devtools")
devtools::install_github( "ThinkR-open/prenoms" )
library("prenoms")
Load the package and its datasets:
library(prenoms)
data("prenoms_france")
data("prenoms")
data("departements")
Example study: how the popularity of current ThinkR staff's first names has changed over time:
library(ggplot2)
library(dplyr)
library(tidyr)
library(purrr)
Let’s define a dataset holding our names and genders:
team_members <- tribble(
  ~name,       ~sex,
  "Colin",     "M",
  "Diane",     "F",
  "Sébastien", "M",
  "Cervan",    "M",
  "Vincent",   "M",
  "Margot",    "F",
  "Estelle",   "F",
  "Arthur",    "M",
  "Antoine",   "M",
  "Florence",  "F",
  "Murielle",  "F",
  "Swann",     "F",
  "Yohann",    "M"
)
Then write a function that keeps only the rows matching our own names and adds the missing name-by-year combinations with a count of zero.
get_thinkr_team_name_data <- function(
  prenoms_df,
  team_members_df
) {
  prenoms_df %>%
    # Keep only the rows matching a team member's name and sex
    inner_join(
      team_members_df,
      by = c("name", "sex")
    ) %>%
    # Add the missing name x year combinations, with zero counts
    complete(
      name = team_members_df$name,
      year = 1900:2021,
      fill = list(n = 0, prop = 0)
    ) %>%
    group_by(name, year, sex) %>%
    summarise(
      n = sum(n),
      .groups = "drop"
    ) %>%
    arrange(year) %>%
    # If sex is not defined (NA), assume it is
    # the same as the corresponding team member's
    mutate(
      sex = map2_chr(
        sex,
        name,
        function(sex, name) {
          ifelse(
            is.na(sex) & name %in% team_members_df$name,
            team_members_df$sex[team_members_df$name == name],
            sex
          )
        }
      )
    )
}
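To see what the join and completion steps do without loading the full dataset, here is a minimal sketch on toy data. The tables `toy_prenoms` and `toy_team` are hypothetical and only mimic the columns used above (`name`, `sex`, `year`, `n`). The last step fills the missing `sex` values with a `left_join()` plus `coalesce()`, which is a vectorised alternative to the `map2_chr()` approach, not the package's own method. It assumes each name appears only once in the team table.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data mimicking the columns of the real dataset
toy_prenoms <- tribble(
  ~name,   ~sex, ~year, ~n,
  "Colin", "M",  2000L, 10,
  "Colin", "M",  2002L, 12,
  "Diane", "F",  2001L,  7,
  "Paul",  "M",  2000L, 50
)

# Hypothetical team table: one row per name
toy_team <- tribble(
  ~name,   ~sex,
  "Colin", "M",
  "Diane", "F"
)

toy_result <- toy_prenoms %>%
  # Keep only rows matching a team member's name and sex ("Paul" is dropped)
  inner_join(toy_team, by = c("name", "sex")) %>%
  # Add every name x year combination; n is 0 where no row existed,
  # and sex is NA on those newly created rows
  complete(
    name = toy_team$name,
    year = 2000:2002,
    fill = list(n = 0)
  ) %>%
  # Fill the NA sex values from the team table (assumes unique names)
  left_join(toy_team, by = "name", suffix = c("", "_team")) %>%
  mutate(sex = coalesce(sex, sex_team)) %>%
  select(-sex_team)

print(toy_result)
```

The result has one row per name and year (2 names x 3 years), so a line plot of `n` drops to zero in years where a name was not recorded instead of skipping them.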
# Data for the whole France
data(prenoms_france)
thinkrs <- get_thinkr_team_name_data(
prenoms_df = prenoms_france,
team_members_df = team_members
)
thinkrs %>%
  ggplot() +
  aes(x = year, y = n, color = name) +
  geom_line() +
  scale_x_continuous(breaks = seq(1900, 2021, by = 10)) +
  labs(title = "ThinkR's team names evolution in France") +
  theme_bw()
# Data by "département"
data(prenoms)
thinkrs_93 <- prenoms %>%
  filter(dpt == 93) %>%
  get_thinkr_team_name_data(
    team_members_df = team_members
  )
thinkrs_93 %>%
  ggplot() +
  aes(x = year, y = n, color = name) +
  geom_line() +
  scale_x_continuous(breaks = seq(1900, 2021, by = 10)) +
  labs(title = "ThinkR's team names evolution in department 93") +
  theme_bw()
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.