DS4PS / ddmp-uw-class-spring-2019

Data-Driven Management & Policy Course at UW, Spring 2019
https://ds4ps.github.io/ddmp-uw-class-spring-2019/
4 stars 3 forks source link

Seeting an API #10

Open aperaltasantos opened 5 years ago

aperaltasantos commented 5 years ago

Dear all,

I am trying to set an API access to my data, and I am running into some problems,

So far, I have been able to access the API but it gives me back a complex object with only part of the observation. The code is bellow

http://rpubs.com/aperalta/493979 (Links to an external site.)Links to an external site.

The original website of the API

https://transparencia.sns.gov.pt/explore/dataset/atividade-de-prestacao-do-sns-24/api/?disjunctive.indicador&disjunctive.dia_da_semana&disjunctive.numero_da_semana&sort=periodo (Links to an external site.)Links to an external site.

Could you help me?

ejvanholm commented 5 years ago

I feel as if I'm missing something with your question.

What is the structure of data_df after you use fromJSON?

lecy commented 5 years ago

The challenge is that the data is returned as a list. Lists are a little challenging to navigate. Here is an example extracting the "fields" data from the "records" list.

JSON from API:

web.text <- '{"nhits": 9207, "parameters": {"dataset": ["atividade-de-prestacao-do-sns-24"], "timezone": "UTC", "rows": 10, "sort": ["periodo"], "format": "json", "facet": ["periodo", "indicador", "dia_da_semana", "numero_da_semana"]}, "records": [{"datasetid": "atividade-de-prestacao-do-sns-24", "recordid": "2e7a35a8201af502d87ed80861d244f8ea9425b0", "fields": {"rowid": 27878, "taxa": 0.010638297872, "periodo": "2019-05-19", "indicador": "Encaminhamentos p/ CIAV", "valor_absoluto": 34, "numero_da_semana": 21, "dia_da_semana": "domingo"}, "record_timestamp": "2019-05-20T07:57:07+00:00"}, {"datasetid": "atividade-de-prestacao-do-sns-24", "recordid": "f3708f4f88c5ddcacbca818fa553d5bee9220f5f", "fields": {"rowid": 27876, "taxa": 0.193992490613, "periodo": "2019-05-19", "indicador": "Encaminhamentos p/ CSP", "valor_absoluto": 620, "numero_da_semana": 21, "dia_da_semana": "domingo"}, "record_timestamp": "2019-05-20T07:57:07+00:00"}, {"datasetid": "atividade-de-prestacao-do-sns-24", "recordid": "e1c8df84badcb467d33df761df7a0395faea3f84", "fields": {"rowid": 27875, "taxa": 0.219336670838, "periodo": "2019-05-19", "indicador": "Encaminhamentos p/ Autocuidados", "valor_absoluto": 701, "numero_da_semana": 21, "dia_da_semana": "domingo"}, "record_timestamp": "2019-05-20T07:57:07+00:00"}, {"datasetid": "atividade-de-prestacao-do-sns-24", "recordid": "f5eef4be3f6eeef9e704fa1f92ed68e576f93c53", "fields": {"rowid": 27879, "taxa": 0.047246558197, "periodo": "2019-05-19", "indicador": "Encaminhamentos p/ INEM", "valor_absoluto": 151, "numero_da_semana": 21, "dia_da_semana": "domingo"}, "record_timestamp": "2019-05-20T07:57:07+00:00"}, {"datasetid": "atividade-de-prestacao-do-sns-24", "recordid": "0285b6825a106b8049cb8eb89db4ed9f14086225", "fields": {"rowid": 27877, "taxa": 0.350125156445, "periodo": "2019-05-19", "indicador": "Encaminhamentos p/ Servi\u00e7o Urg\u00eancia", "valor_absoluto": 1119, "numero_da_semana": 21, "dia_da_semana": "domingo"}, "record_timestamp": "2019-05-20T07:57:07+00:00"}, {"datasetid": "atividade-de-prestacao-do-sns-24", "recordid": "d757bf3ddc0ad8cb16c72f7e9684febffe7b2759", "fields": {"rowid": 27880, "taxa": 0.178660826032, "periodo": "2019-05-19", "indicador": "Encaminhamentos p/ Outros", "valor_absoluto": 571, "numero_da_semana": 21, "dia_da_semana": "domingo"}, "record_timestamp": "2019-05-20T07:57:07+00:00"}, {"datasetid": "atividade-de-prestacao-do-sns-24", "recordid": "500220fa40b6611c92acfe19926a557985d8853b", "fields": {"rowid": 27844, "taxa": 0.248043552228, "periodo": "2019-05-18", "indicador": "Encaminhamentos p/ Autocuidados", "valor_absoluto": 729, "numero_da_semana": 20, "dia_da_semana": "s\u00e1bado"}, "record_timestamp": "2019-05-19T07:57:07+00:00"}, {"datasetid": "atividade-de-prestacao-do-sns-24", "recordid": "89824f9cab5bf556fb7c098b9dc346ecc37139d4", "fields": {"rowid": 27848, "taxa": 0.048315753657, "periodo": "2019-05-18", "indicador": "Encaminhamentos p/ INEM", "valor_absoluto": 142, "numero_da_semana": 20, "dia_da_semana": "s\u00e1bado"}, "record_timestamp": "2019-05-19T07:57:07+00:00"}, {"datasetid": "atividade-de-prestacao-do-sns-24", "recordid": "c68036b33a36fda8f3a739238ba3bfed8de4d74d", "fields": {"rowid": 27845, "taxa": 0.154814562776, "periodo": "2019-05-18", "indicador": "Encaminhamentos p/ CSP", "valor_absoluto": 455, "numero_da_semana": 20, "dia_da_semana": "s\u00e1bado"}, "record_timestamp": "2019-05-19T07:57:07+00:00"}, {"datasetid": "atividade-de-prestacao-do-sns-24", "recordid": "6eeba5a6dc8057ad25dbeddf1067f52955b8d467", "fields": {"rowid": 27849, "taxa": 0.187478734263, "periodo": "2019-05-18", "indicador": "Encaminhamentos p/ Outros", "valor_absoluto": 551, "numero_da_semana": 20, "dia_da_semana": "s\u00e1bado"}, "record_timestamp": "2019-05-19T07:57:07+00:00"}], "facet_groups": [{"name": "periodo", "facets": [{"name": "2016", "path": "2016", "count": 336, "state": "displayed"}, {"name": "2017", "path": "2017", "count": 3963, "state": "displayed"}, {"name": "2018", "path": "2018", "count": 3942, "state": "displayed"}, {"name": "2019", "path": "2019", "count": 966, "state": "displayed"}]}, {"name": "indicador", "facets": [{"name": "Encaminhamentos p/ Outros", "path": "Encaminhamentos p/ Outros", "count": 1536, "state": "displayed"}, {"name": "Encaminhamentos p/ INEM", "path": "Encaminhamentos p/ INEM", "count": 1535, "state": "displayed"}, {"name": "Encaminhamentos p/ CIAV", "path": "Encaminhamentos p/ CIAV", "count": 1534, "state": "displayed"}, {"name": "Encaminhamentos p/ Autocuidados", "path": "Encaminhamentos p/ Autocuidados", "count": 1533, "state": "displayed"}, {"name": "Encaminhamentos p/ CSP", "path": "Encaminhamentos p/ CSP", "count": 1533, "state": "displayed"}, {"name": "Encaminhamentos p/ Servi\u00e7o Urg\u00eancia", "path": "Encaminhamentos p/ Servi\u00e7o Urg\u00eancia", "count": 1533, "state": "displayed"}, {"name": "Chamadas Abandonadas at\u00e9 15s", "path": "Chamadas Abandonadas at\u00e9 15s", "count": 1, "state": "displayed"}, {"name": "Chamadas Atendidas", "path": "Chamadas Atendidas", "count": 1, "state": "displayed"}, {"name": "Chamadas Oferecidas", "path": "Chamadas Oferecidas", "count": 1, "state": "displayed"}]}, {"name": "dia_da_semana", "facets": [{"name": "domingo", "path": "domingo", "count": 1325, "state": "displayed"}, {"name": "segunda-feira", "path": "segunda-feira", "count": 1318, "state": "displayed"}, {"name": "s\u00e1bado", "path": "s\u00e1bado", "count": 1318, "state": "displayed"}, {"name": "quarta-feira", "path": "quarta-feira", "count": 1312, "state": "displayed"}, {"name": "quinta-feira", "path": "quinta-feira", "count": 1312, "state": "displayed"}, {"name": "sexta-feira", "path": "sexta-feira", "count": 1312, "state": "displayed"}, {"name": "ter\u00e7a-feira", "path": "ter\u00e7a-feira", "count": 1310, "state": "displayed"}]}, {"name": "numero_da_semana", "facets": [{"name": "3", "path": "3", "count": 252, "state": "displayed"}, {"name": "51", "path": "51", "count": 252, "state": "displayed"}, {"name": "52", "path": "52", "count": 252, "state": "displayed"}, {"name": "50", "path": "50", "count": 242, "state": "displayed"}, {"name": "2", "path": "2", "count": 226, "state": "displayed"}, {"name": "4", "path": "4", "count": 225, "state": "displayed"}, {"name": "8", "path": "8", "count": 210, "state": "displayed"}, {"name": "12", "path": "12", "count": 210, "state": "displayed"}, {"name": "17", "path": "17", "count": 210, "state": "displayed"}, {"name": "16", "path": "16", "count": 207, "state": "displayed"}, {"name": "7", "path": "7", "count": 204, "state": "displayed"}, {"name": "13", "path": "13", "count": 204, "state": "displayed"}, {"name": "1", "path": "1", "count": 197, "state": "displayed"}, {"name": "9", "path": "9", "count": 189, "state": "displayed"}, {"name": "11", "path": "11", "count": 189, "state": "displayed"}, {"name": "18", "path": "18", "count": 184, "state": "displayed"}, {"name": "10", "path": "10", "count": 180, "state": "displayed"}, {"name": "5", "path": "5", "count": 179, "state": "displayed"}, {"name": "15", "path": "15", "count": 179, "state": "displayed"}, {"name": "19", "path": "19", "count": 179, "state": "displayed"}, {"name": "6", "path": "6", "count": 178, "state": "displayed"}, {"name": "14", "path": "14", "count": 178, "state": "displayed"}, {"name": "20", "path": "20", "count": 177, "state": "displayed"}, {"name": "29", "path": "29", "count": 168, "state": "displayed"}, {"name": "30", "path": "30", "count": 168, "state": "displayed"}, {"name": "33", "path": "33", "count": 168, "state": "displayed"}, {"name": "34", "path": "34", "count": 168, "state": "displayed"}, {"name": "38", "path": "38", "count": 168, "state": "displayed"}, {"name": "39", "path": "39", "count": 168, "state": "displayed"}, {"name": "42", "path": "42", "count": 168, "state": "displayed"}, {"name": "43", "path": "43", "count": 168, "state": "displayed"}, {"name": "47", "path": "47", "count": 168, "state": "displayed"}, {"name": "48", "path": "48", "count": 168, "state": "displayed"}, {"name": "46", "path": "46", "count": 167, "state": "displayed"}, {"name": "35", "path": "35", "count": 165, "state": "displayed"}, {"name": "37", "path": "37", "count": 165, "state": "displayed"}, {"name": "49", "path": "49", "count": 163, "state": "displayed"}, {"name": "28", "path": "28", "count": 162, "state": "displayed"}, {"name": "44", "path": "44", "count": 162, "state": "displayed"}, {"name": "31", "path": "31", "count": 153, "state": "displayed"}, {"name": "41", "path": "41", "count": 153, "state": "displayed"}, {"name": "32", "path": "32", "count": 147, "state": "displayed"}, {"name": "40", "path": "40", "count": 147, "state": "displayed"}, {"name": "27", "path": "27", "count": 139, "state": "displayed"}, {"name": "45", "path": "45", "count": 139, "state": "displayed"}, {"name": "36", "path": "36", "count": 138, "state": "displayed"}, {"name": "21", "path": "21", "count": 132, "state": "displayed"}, {"name": "26", "path": "26", "count": 131, "state": "displayed"}, {"name": "24", "path": "24", "count": 126, "state": "displayed"}, {"name": "25", "path": "25", "count": 126, "state": "displayed"}, {"name": "22", "path": "22", "count": 116, "state": "displayed"}, {"name": "23", "path": "23", "count": 100, "state": "displayed"}, {"name": "53", "path": "53", "count": 93, "state": "displayed"}]}]}'
library( jsonlite )
dat <- fromJSON( web.text )
class( dat )  # list
names( dat )  # "nhits", "parameters", "records", "facet_groups"

library( data.table )
records <- dat[["records"]]
fields <- sapply( records, '[', 'fields' )
data.table::rbindlist( fields )

dt <- data.table::rbindlist( fields )
df <- as.data.frame( dt )

knitr::kable( df )
owid taxa periodo indicador valor_absoluto numero_da_semana dia_da_semana
27878 0.0106383 2019-05-19 Encaminhamentos p/ CIAV 34 21 domingo
27876 0.1939925 2019-05-19 Encaminhamentos p/ CSP 620 21 domingo
27875 0.2193367 2019-05-19 Encaminhamentos p/ Autocuidados 701 21 domingo
27879 0.0472466 2019-05-19 Encaminhamentos p/ INEM 151 21 domingo
27877 0.3501252 2019-05-19 Encaminhamentos p/ Serviço Urgência 1119 21 domingo
27880 0.1786608 2019-05-19 Encaminhamentos p/ Outros 571 21 domingo
27844 0.2480436 2019-05-18 Encaminhamentos p/ Autocuidados 729 20 sábado
27848 0.0483158 2019-05-18 Encaminhamentos p/ INEM 142 20 sábado
27845 0.1548146 2019-05-18 Encaminhamentos p/ CSP 455 20 sábado
27849 0.1874787 2019-05-18 Encaminhamentos p/ Outros 551 20 sábado
aperaltasantos commented 5 years ago

Thank you for the help, I solve the issue

the code is bellow Some comments I added to the URL &rows=-1 to have all rows in the database, and also did some tricks with the large list.

library(rjson)
library(jsonlite)
library(RCurl)
library(tidyverse)
# Base URL path
base_url = "https://transparencia.sns.gov.pt/api/records/1.0/search/?dataset=atividade-de-prestacao-do-sns-24&rows=-1&sort=periodo&facet=periodo&facet=indicador&facet=dia_da_semana&facet=numero_da_semana"
# encode the URL with characters for each space.
base_url <- URLencode(base_url)

# Convert JSON to data frame
sns24 <- fromJSON(getURL(base_url))
sns24 <- as.data.frame(sns24$records)
sns24 <- as.data.frame(sns24$fields)