christophergandrud / imfr

R package for interacting with the IMF RESTful JSON API

Multiple values for single month, single country and one indicator #28

Open madhurmehta1 opened 3 years ago

madhurmehta1 commented 3 years ago

Hi,

I am trying to use the imfr package to get data. This is the code for the data I am trying to access:

dt1 <- imf_data(database_id = "IRFCL", indicator = "RAFAO_USD", freq = 'M', country = "all", start = "1995", end = current_year())
dt2 <- imf_data(database_id = "IRFCL", indicator = "RAFAFX_USD", freq = 'M', country = "all", start = "1995", end = current_year())

The problem is that for a country like Great Britain (GB), for the month of May 2008, I am getting 9 values for one indicator, and there are two different sets of values. (This is the case for numerous countries.)

Please help me resolve this.

cjyetman commented 3 years ago

Looks like the API is returning data for multiple reference sectors.

you could use return_raw = TRUE to get the raw data and extract just the data for the reference sector that you're interested in... here's a start towards that...

raw <- imf_data(database_id = "IRFCL", indicator = "RAFAFX_USD", 
                freq = 'M', country = "GB", return_raw = TRUE,
                start = "2008", end = "2008")

overview <- raw$CompactData$DataSet$Series
observations <- raw$CompactData$DataSet$Series$Obs

available_freq <- overview$`@FREQ`
available_sector <- overview$`@REF_SECTOR`

series_freq <- grep("M", available_freq)
series_sector <- grep("S1311", available_sector)
series_pos <- intersect(series_freq, series_sector)

observations[[series_pos]]
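
One caveat with the snippet above if it is run for more than one country: series_pos can then hold several positions, and `[[` with a vector of positions indexes recursively instead of selecting several series. Also, grep("M", ...) matches any value that contains an "M", not only the value "M". A minimal sketch of a stricter selection, using a made-up stand-in for the raw structure (the values, and the "S1X" sector code, are purely illustrative):

```r
# Mock stand-in for raw$CompactData$DataSet$Series (made-up values)
overview <- data.frame(
  `@FREQ`       = c("M", "M", "A", "M"),
  `@REF_AREA`   = c("GB", "GB", "GB", "FR"),
  `@REF_SECTOR` = c("S1311", "S1X", "S1311", "S1311"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Mock stand-in for the per-series observation tables
observations <- lapply(1:4, function(i) {
  data.frame(`@TIME_PERIOD` = "2008-05", `@OBS_VALUE` = as.character(i),
             check.names = FALSE, stringsAsFactors = FALSE)
})

# Exact comparison, so "M" cannot match a longer code by accident
series_pos <- which(overview$`@FREQ` == "M" &
                    overview$`@REF_SECTOR` == "S1311")

# Single brackets keep every matching series; label each by its country
selected <- observations[series_pos]
names(selected) <- overview$`@REF_AREA`[series_pos]
```

Here selected is a named list with one observation table per matching country, which can then be bound into a single data frame.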

madhurmehta1 commented 3 years ago

Hi! Thank you for your response. Using the following command,

dt1<-imf_data(database_id = "IRFCL",indicator = "RAFAO_USD", freq = 'M', country = "all", start = "1995", end = current_year())

we do get multiple values, but we get data for some 80 countries.

raw <- imf_data(database_id = "IRFCL", indicator = "RAFAFX_USD",
                freq = 'M', country = "all", return_raw = TRUE,
                start = "2008", end = "2008")

overview <- raw$CompactData$DataSet$Series
observations <- raw$CompactData$DataSet$Series$Obs

available_freq <- overview$`@FREQ`
available_sector <- overview$`@REF_SECTOR`

series_freq <- grep("M", available_freq)
series_sector <- grep("S1311", available_sector)
series_pos <- intersect(series_freq, series_sector)

observations[[series_pos]]

Using the above code, we only get 60 countries.

So my questions are:

1) Why is there such a discrepancy in the number of countries?
2) I need data for the original sample of countries. How do I get that using this package?

cjyetman commented 3 years ago
  1. why such a discrepancy in the number of countries?

It looks like the package only returns the first 60 countries' worth of data when using return_raw = TRUE. I would guess that is because it's attempting to limit the data transfer and not exceed the IMF's API limit, hence the warning/message it returns: "Only returning data for the first 60 countries."

  2. I need data for the original sample of countries. How do I get that using this package?

You could loop through sets of 60 countries at a time and then bind the outputs together. Obviously not ideal, but it's just a suggestion.
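
A rough sketch of that loop, in case it helps. The helper names split_into_chunks() and fetch_in_chunks() are made up for this example, and the fetcher is a fake one so the code runs without touching the API. The chunk size of 60 is taken from the message above.

```r
# Split a vector of country codes into groups of at most `size` codes
split_into_chunks <- function(codes, size = 60) {
  split(codes, ceiling(seq_along(codes) / size))
}

# Call `fetch` once per chunk and row-bind the results.
# `fetch` is any function that takes a character vector of country codes
# and returns a data frame (hypothetical interface for this sketch).
fetch_in_chunks <- function(codes, fetch, size = 60) {
  chunks <- split_into_chunks(codes, size)
  results <- lapply(chunks, fetch)
  out <- do.call(rbind, unname(results))
  rownames(out) <- NULL
  out
}

# Fake fetcher standing in for the real API call
fake_fetch <- function(codes) {
  data.frame(iso2c = codes, value = seq_along(codes),
             stringsAsFactors = FALSE)
}

all_codes <- sprintf("C%03d", 1:150)  # 150 made-up country codes
combined <- fetch_in_chunks(all_codes, fake_fetch)
```

With the real package, fetch would presumably be a small wrapper around imf_data() that passes the chunk as the country argument, and all_codes would be the vector of ISO two-letter codes you want. If the chunks can come back with different columns, dplyr::bind_rows() is more forgiving than rbind().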

cjyetman commented 3 years ago

Since this package does not do any filtering of @REF_SECTOR, you're probably better off doing this "manually", something like...

library(jsonlite)
library(dplyr)

# Query the API directly; the key sets only the indicator
url <- "http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/IRFCL/..RAFAFX_USD?startPeriod=2008&endPeriod=2008"
raw_data <- fromJSON(url)

# One element per series; keep only those parsed into a data frame
observations <- raw_data$CompactData$DataSet$Series$Obs
is_data_frame <- unlist(lapply(observations, is.data.frame))

# Pull the series-level attributes for the same subset of series
observations <- observations[is_data_frame]
FREQs <- raw_data$CompactData$DataSet$Series$`@FREQ`[is_data_frame]
REF_AREAs <- raw_data$CompactData$DataSet$Series$`@REF_AREA`[is_data_frame]
INDICATORs <- raw_data$CompactData$DataSet$Series$`@INDICATOR`[is_data_frame]
REF_SECTORs <- raw_data$CompactData$DataSet$Series$`@REF_SECTOR`[is_data_frame]
UNIT_MULTs <- raw_data$CompactData$DataSet$Series$`@UNIT_MULT`[is_data_frame]
TIME_FORMATs <- raw_data$CompactData$DataSet$Series$`@TIME_FORMAT`[is_data_frame]

# Attach each series' attributes to its observations as extra columns
observations <- 
  lapply(seq_along(observations), function(i) {
    df <- observations[[i]]
    df$FREQ <- FREQs[i]
    df$REF_AREA <- REF_AREAs[i]
    df$INDICATOR <- INDICATORs[i]
    df$REF_SECTOR <- REF_SECTORs[i]
    df$UNIT_MULT <- UNIT_MULTs[i]
    df$TIME_FORMAT <- TIME_FORMATs[i]
    df
  })

# Stack all series into one long data frame
observations <- bind_rows(observations)

# Keep monthly data for a single reference sector
observations %>% 
  filter(FREQ == "M") %>% 
  filter(REF_SECTOR == "S1311")
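
To confirm that the filtering really leaves one value per country and month, a quick duplicate check along these lines might be useful. This is only a sketch on made-up data. The column names `@TIME_PERIOD` and `@OBS_VALUE` are an assumption about what the API returns in each Obs table, so adjust them to whatever your result actually contains.

```r
# Mock stand-in for the filtered result (made-up values)
filtered <- data.frame(
  `@TIME_PERIOD` = c("2008-05", "2008-05", "2008-06", "2008-05"),
  `@OBS_VALUE`   = c("10", "12", "11", "7"),
  REF_AREA       = c("GB", "GB", "GB", "FR"),
  INDICATOR      = "RAFAFX_USD",
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# A row is a duplicate if another row shares its country, indicator and period
key_cols <- c("REF_AREA", "INDICATOR", "@TIME_PERIOD")
keys <- filtered[key_cols]
is_dup <- duplicated(keys) | duplicated(keys, fromLast = TRUE)

# Rows that still need attention; zero rows means one value per key
remaining_duplicates <- filtered[is_dup, ]
```

In this mock data the two GB rows for 2008-05 are flagged, which is the kind of leftover that would point to another dimension still needing a filter.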