ropensci / nomisr

Access UK official statistics from the Nomis database through R.
https://docs.ropensci.org/nomisr

nomis_get_data error - Can't combine `RECORD_COUNT` <double> and `RECORD_COUNT` <character> #27

Closed JoannaWatson closed 2 years ago

JoannaWatson commented 2 years ago

Hi, I'm having trouble using some code that appeared to work fine a few months ago but is now throwing up the error in the title.

The error occurs when trying to extract claimant count data, code as follows:

Claimant_count <- nomis_get_data(id = "NM_162_1",
                                 #time = timesel,
                                 time = "latest",
                                 geography = "TYPE432", # district/UA as of April 2021
                                 measures = 20100,
                                 tidy = TRUE,
                                 select = c("DATE", "DATE_NAME", "GEOGRAPHY_NAME", "GEOGRAPHY_CODE",
                                            "GENDER_NAME", "MEASURE_NAME", "OBS_VALUE", "RECORD_COUNT"))

It seems to affect any variable that is of type and I wondered if there had been a change to the data type in the underlying nomis data set?

evanodell commented 2 years ago

I'll look into this. Could be an underlying change in the data, could be from rate limiting kicking in if the query is larger than before.

ammar-gla commented 2 years ago

I have the same issue with the claimant count dataset (though my initial error was for DATE_TYPECODE) with code that worked fine last month. I now also get a warning that I am trying to access more than 375000 rows, which requires a manual interaction with RStudio, whereas there were no breaks beforehand.

I initially wrote to Nomis, who responded that there have been no changes to the dataset, but that they "are aware of a problem with API downloads using the data.json output format". I am quite new to R, so I have not been able to figure out whether nomisr uses data.json or something else, but I hope this is useful information to you.

evanodell commented 2 years ago

Thanks @ammar-gla. What was the query that prompted this warning? nomis_get_data queries for CSVs, not data.json files, but the problem could affect both formats, or the Nomis API could be relying on data.json data for CSV-formatted queries.

I broke down @JoannaWatson's query and it returns some empty pages, which is what causes the "Can't combine RECORD_COUNT <double> and RECORD_COUNT <character>" error. However, the query shouldn't be retrieving empty pages, and the result is 218 rows, which, if correct, is far too small to be causing pagination issues. I'll keep investigating.
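To illustrate how an empty page can lead to this kind of type clash, here is a minimal base R sketch. It assumes that each page of results arrives as a data frame and that a page with no rows has its columns read as character, since there are no values to infer a type from. The `pages` list, its values, and the `combine_pages` helper are hypothetical and for illustration only; they are not part of nomisr and not necessarily how the package handles pagination.

```r
# Hypothetical pages (made-up values): one page with data, and one empty
# page whose RECORD_COUNT column came back as character instead of numeric.
pages <- list(
  data.frame(GEOGRAPHY_NAME = c("Area A", "Area B"),
             RECORD_COUNT = c(1, 1),
             stringsAsFactors = FALSE),
  data.frame(GEOGRAPHY_NAME = character(0),
             RECORD_COUNT = character(0),
             stringsAsFactors = FALSE)
)

# The two pages disagree on the type of RECORD_COUNT, which is the
# mismatch that a strict row-binding step would refuse to combine.
sapply(pages, function(p) class(p$RECORD_COUNT))

# Hypothetical helper: drop zero-row pages, then row-bind what is left.
combine_pages <- function(pages) {
  non_empty <- Filter(function(p) nrow(p) > 0, pages)
  if (length(non_empty) == 0) {
    return(data.frame())
  }
  do.call(rbind, non_empty)
}

combined <- combine_pages(pages)
str(combined)
```

Dropping the empty pages before combining keeps RECORD_COUNT numeric. It only hides the symptom, though: as noted above, the query should not be returning empty pages in the first place.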

ammar-gla commented 2 years ago

I used the query below. Though the error curiously does not happen if I only use one of the geographies instead of several, the data ends up being incomplete as it only retrieves some of the dates between Dec-2019 and the latest date, whereas it previously retrieved all dates.

Group <- c(2013265927,2092957697,1811939540,1811939541,1811939542,1811939543,1811939544,1811939526,1811939527,1811939545,1811939546,1811939547,
           1811939548,1811939528,1811939529,1811939530,1811939549,1811939550,1811939551,1811939552,1811939531,1811939532,1811939553,1811939533,
           1811939534,1811939554,1811939535,1811939555,1811939556,1811939536,1811939557,1811939537,1811939558,1811939538,1811939539)

#Retrieve claimant count data and receive error
claimant_count_stats <- nomis_get_data(
  id = "NM_162_1", 
  geography = Group,
  time = c("2019-12", "latest"))

ammar-gla commented 2 years ago

Hi @evanodell, just to follow up on this, I tried the same code this morning and it now seems to work perfectly fine. I am not sure whether the issue was on my end (though I changed nothing from last week) or at Nomis. Thanks for looking into it!

JoannaWatson commented 2 years ago

Hi @evanodell, I've just tried to run my previous code and it is now working, so I think Nomis must have changed something. Thank you so much for looking into this for me.