CorrelAid / CitiesRopen

Wrapper around the DKAN API of OpenData Konstanz

CitiesRopen

Project Status: Active (the project has reached a stable, usable state and is being actively developed). Lifecycle: experimental.

This package allows you to directly inspect and download data from the Open Data Portal of Constance. It is aimed at practitioners, members of civil society and academics, and expects users to have only a basic understanding of R. Technically, the package relies on the DKAN API.

Features

Usage

Installation

You can install the package directly from GitHub using the install_github function from the devtools package, as shown below. Please make sure that the devtools package is installed on your machine before starting the download.

install.packages("devtools")
devtools::install_github("CorrelAid/CitiesRopen")

Structure

The package provides two major functions, which build on each other. First, you always have to call show_data to get an overview of the files in the data portal. As described below, you can combine your query with different filter arguments to restrict your search to the files of interest. Once you have restricted your query, you can start downloading the files using the get_data function. To connect both functions, use the pipe operator %>% from magrittr.

# Basic structure (without filter arguments)
library(magrittr)    # provides the %>% pipe operator
library(CitiesRopen)
show_data() %>% 
  get_data()
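To make the role of the pipe concrete, here is a minimal sketch showing that piping one function into another is equivalent to nesting the calls. The functions list_files and fetch_files are hypothetical stand-ins for show_data() and get_data(); they do not call the API and are not part of CitiesRopen. The sketch assumes the magrittr package is installed.

```r
library(magrittr)

# Hypothetical stand-ins for show_data() and get_data(); no API call is made.
list_files <- function() {
  data.frame(
    name   = c("a.csv", "b.json"),
    format = c("csv", "json"),
    stringsAsFactors = FALSE
  )
}

fetch_files <- function(files) {
  # Return the names of the files that would be fetched.
  files$name
}

# The pipe passes the left-hand result as the first argument on the right.
piped  <- list_files() %>% fetch_files()
nested <- fetch_files(list_files())

identical(piped, nested)
```

Both forms produce the same result; the piped form simply reads in the order the steps happen.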

show_data()

The show_data function calls the API and retrieves a complete list of the data files available in the data portal. Depending on the arguments specified by the user, this list is then filtered accordingly.

In terms of terminology, a single file represents one specific document with a unique name and format. Several files are grouped into a smaller number of resources. For instance, the resource Wahlergebnisse Konstanzer Oberbürgermeisterwahlen contains several files, such as the results of the city's mayoral elections in 1996, 2004, 2012 and 2020. Finally, at least one category is assigned to each resource. Categories represent a specific thematic focus, such as Politik und Wahlen, Soziales and Umwelt und Klima. Frequently, more than one category is assigned to a resource.

In the data portal, files are referred to as Dateien und Quellen, resources as Data Sets and categories as Kategorien.
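The hierarchy of files, resources and categories can be pictured as a small table with one row per file. The following base R sketch uses made-up rows (the file names, formats and the second resource are illustrative assumptions, not actual portal contents) to show how filtering by category and format narrows the list. The helper filter_catalogue is hypothetical and does not reproduce how show_data() is implemented.

```r
# Toy catalogue: one row per file. All rows are illustrative, not real portal data.
catalogue <- data.frame(
  file     = c("wahl_1996", "wahl_2004", "wahl_2012", "luft_messwerte"),
  format   = c("csv", "csv", "pdf", "csv"),
  resource = c(rep("Wahlergebnisse Konstanzer Oberbürgermeisterwahlen", 3),
               "Luftqualität"),
  category = c(rep("Politik und Wahlen", 3), "Umwelt und Klima"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the rows matching the given category and/or format.
filter_catalogue <- function(x, category = NULL, format = NULL) {
  keep <- rep(TRUE, nrow(x))
  if (!is.null(category)) keep <- keep & x$category == category
  if (!is.null(format))   keep <- keep & x$format == format
  x[keep, , drop = FALSE]
}

hits <- filter_catalogue(catalogue, category = "Politik und Wahlen", format = "csv")
hits$file
```

Leaving both arguments unset returns the full table, which mirrors calling the listing step without any filter.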

The following arguments are available for show_data():

get_data()

download: choose from "environment" (default) or "local".

If you want to read the data directly into R, keep the default download = "environment", which saves the data in a new list, List_Open_Data, where each element represents one data file. If you want to download the files to your local machine instead, specify download = "local".
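The difference between the two modes can be illustrated with a small base R sketch. The helper store_files is hypothetical, the toy data frames stand in for downloaded files, and writing to tempdir() as CSV is an assumption made for the example; the package's actual implementation may differ.

```r
# Toy stand-ins for two downloaded data files (illustrative only).
files <- list(
  file_a = data.frame(x = 1:3),
  file_b = data.frame(y = c("u", "v"), stringsAsFactors = FALSE)
)

# Hypothetical helper mimicking the two modes described above.
store_files <- function(files, download = c("environment", "local"),
                        dir = tempdir()) {
  download <- match.arg(download)
  if (download == "environment") {
    # Keep everything in memory: one list element per data file.
    return(files)
  }
  # Write each element to disk as CSV and return the file paths.
  paths <- file.path(dir, paste0(names(files), ".csv"))
  for (i in seq_along(files)) {
    write.csv(files[[i]], paths[i], row.names = FALSE)
  }
  paths
}

List_Open_Data <- store_files(files)                      # default: "environment"
local_paths    <- store_files(files, download = "local")  # files written to tempdir()
```

In this sketch the mode is matched case-sensitively via match.arg, so only the exact strings "environment" and "local" are accepted.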

Use Cases

Use Case 1: Filtering by category and format

CitiesRopen::show_data(category = "Politik und Wahlen", format = "csv") %>% 
  CitiesRopen::get_data()

Use Case 2: Filtering by file

CitiesRopen::show_data(file = "Wanderung nach Staaten") %>% 
  CitiesRopen::get_data()

Use Case 3: Calling the function without a message

CitiesRopen::show_data(message = FALSE) %>% 
  CitiesRopen::get_data()

Use Case 4: Downloading the files to a local directory

CitiesRopen::show_data(category = "Geo", format = "csv") %>% 
  CitiesRopen::get_data(download = "local")