Open dataninjafi opened 5 years ago
I think this has been implemented in inst/extras/create_municipality_keys.R and "kunta" has been added afterwards. This might be for compatibility reasons with other data sources (@muuankarski ?).
I also think that "Pedersöre" would be more handy. This brings to mind two topics to decide:
1) Should we stick to the original names in the default data table, and then provide a separate wrapper that can be used if one likes to further harmonize the names or convert them into different formats, depending on the compatibility needs?
2) The field "kunta_name" might be better renamed as "municipality_fi" or something?
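A wrapper of the kind described in option 1 might look roughly like the sketch below. The function name shorten_municipality_names() and the override table are hypothetical and not part of the package. A lookup table is assumed here rather than simply stripping " kunta", because stripping would leave the genitive form "Pedersören" instead of "Pedersöre".

```r
# Hypothetical override table: official Finnish name -> alternative short form.
# Illustrative assumption only, not package data.
short_name_overrides <- c("Pedersören kunta" = "Pedersöre")

# Hypothetical wrapper: replace only the names listed in the override table
# and leave every other name untouched.
shorten_municipality_names <- function(x, overrides = short_name_overrides) {
  hit <- x %in% names(overrides)
  x[hit] <- unname(overrides[x[hit]])
  x
}

official <- c("Helsinki", "Pedersören kunta", "Kemiönsaari")
shorten_municipality_names(official)
```

With this split the default table keeps the official names, and anyone needing the shorter forms opts in explicitly.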
Most probably "Pedersören kunta" is the correct name in Finnish. I had a quick look at a few kuntadata resources (code below) and they all had "Pedersören kunta" in Finnish and "Pedersöre" in Swedish. Wikipedia and the municipality's website (at the bottom) also use "Pedersören kunta". It is odd and there must be a reason for this, but I think we had better stick to "Pedersören kunta" in the Finnish names.
As for column names, there are also name_fi and name_sv columns. name_fi equals kunta_name, and therefore kunta_name could be removed completely.
# Let's query some kuntadata to see how Pedersöre is written
library(dplyr)
library(rvest)
# 1. Tilastokeskuksen kuntaluokitus
## In Finnish
read_html("https://www.tilastokeskus.fi/meta/luokitukset/kunta/001-2019/index.html") %>%
  html_table(fill = TRUE) %>%
  .[2] %>%
  .[[1]] %>%
  as_tibble(.name_repair = "universal") %>%
  filter(grepl("Pedersö", X2))
# X1 X2
# <int> <chr>
# 599 Pedersören kunta
## In Swedish
read_html("https://www.tilastokeskus.fi/meta/luokitukset/kunta/001-2019/index_sv.html") %>%
  html_table(fill = TRUE) %>%
  .[2] %>%
  .[[1]] %>%
  as_tibble(.name_repair = "universal") %>%
  filter(grepl("Pedersö", X2))
# X1 X2
# <int> <chr>
# 599 Pedersöre
## In English
read_html("https://www.tilastokeskus.fi/meta/luokitukset/kunta/001-2019/index_en.html") %>%
  html_table(fill = TRUE) %>%
  .[2] %>%
  .[[1]] %>%
  as_tibble(.name_repair = "universal") %>%
  filter(grepl("Pedersö", X2))
# X1 X2
# <int> <chr>
# 599 Pedersöre
# 2. Kuntaliitto: Alueluokat ja kuntanumerot 2019
fly <- tempfile()
# mode = "wb" keeps the binary .xlsx file intact on Windows
download.file("https://www.kuntaliitto.fi/sites/default/files/media/file/Alueluokat%20ja%20kuntanumerot%202019.xlsx",
              fly, mode = "wb")
readxl::read_excel(fly, skip = 12) %>%
  filter(grepl("Pedersö", `Kunnan nimi`)) %>%
  select(1:3)
# Kuntanumero `Kunnan nimi` `Ruotsinkielilinen nimi`
# <chr> <chr> <chr>
# 599 Pedersören kunta Pedersöre
# 3. MML:n kuntarajat Paituli paikkatietopalvelusta
library(ows4R)
wfs <- WFSClient$new("http://avaa.tdata.fi/geoserver/paituli/wfs",
                     serviceVersion = "2.0.0",
                     logger = "INFO")
caps <- wfs$getCapabilities()
ft <- caps$findFeatureTypeByName("paituli:mml_hallinto_2018_10k", exact = TRUE)
shape <- ft$getFeatures()
shape %>%
  filter(grepl("Pedersö", NAMEFIN)) %>%
  select(NATCODE, NAMEFIN, NAMESWE)
# NATCODE NAMEFIN NAMESWE the_geom
# 599 Pedersören kunta Pedersöre MULTISURFACE (POLYGON ((287...
I think it is good to use the official names by default (and yes let's remove "kunta_name" field).
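Before dropping the field, it may be worth confirming that it really duplicates name_fi. The sketch below uses a small toy data frame as a stand-in for municipality_key_2019; the rows and the exact column set are assumptions for illustration only.

```r
# Toy stand-in for municipality_key_2019 (illustrative values, assumed columns).
key <- data.frame(
  kunta = c(91L, 599L),
  kunta_name = c("Helsinki", "Pedersören kunta"),
  name_fi = c("Helsinki", "Pedersören kunta"),
  name_sv = c("Helsingfors", "Pedersöre"),
  stringsAsFactors = FALSE
)

# Drop the duplicate column only if it carries no extra information.
if (identical(key$kunta_name, key$name_fi)) {
  key$kunta_name <- NULL
}
names(key)
```

If the two columns ever differ, the check leaves kunta_name in place so the mismatch can be inspected first.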
The data generation script in inst/extras/create_municipality_keys.R seems to make some modifications so let us make sure that the names are kept in their official formats.
If there is a need, we can add wrappers that convert the official names to shorter or other alternative forms.
In municipality_key_2019$kunta_name, Pedersöre's name is "Pedersören kunta", while the other municipalities lack the " kunta" ending.
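A quick way to list such outliers is to match the suffix with a regular expression. This is a minimal sketch on a made-up character vector; against the real table one would presumably pass municipality_key_2019$kunta_name instead.

```r
# Made-up sample of Finnish municipality names (illustrative only).
names_fi <- c("Helsinki", "Pedersören kunta", "Kemiönsaari")

# Flag names that end with the literal suffix " kunta".
has_suffix <- grepl(" kunta$", names_fi)
names_fi[has_suffix]
```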