rOpenGov / eurostat

R tools for Eurostat data
http://ropengov.github.io/eurostat
Other
234 stars 46 forks source link

Update get_eurostat_json.R #262

Closed ake123 closed 1 year ago

ake123 commented 1 year ago

Return informative get_eurostat_json error messages to the user #261

pitkant commented 1 year ago

It seems that the problem described in #260 is now solved. Great. However, I couldn't use examples mentioned there to debug this new code. Boo.

I could do something by providing a wrong parameter to the API. E.g. this erroneous query (it uses fromTimePeriod instead of the correct sinceTimePeriod):

get_eurostat("TIPSHO40", filter = list(fromTimePeriod = 2005, untilTimePeriod = 2007))

Procudes this url: https://ec.europa.eu/eurostat/api/dissemination/statistics/1.0/data/TIPSHO40?fromTimePeriod=2020&untilTimePeriod=2021&format=JSON&lang=EN Which produces the following error:

{ "error": [{"status": 400,"id": 150,"label": "INVALID_QUERY_DIMENSION: Query is invalid as per its structure's definition. Dimension \"FROMTIMEPERIOD\" is not defined"}]}

When I do the query in R, I get the following output:

get_eurostat("TIPSHO40", filter = list(fromTimePeriod = 2005, untilTimePeriod = 2007))
Request failed [400]. Retrying in 1.7 seconds...
Request failed [400]. Retrying in 4 seconds...
Error in get_eurostat_json(id, filters, type = type, stringsAsFactors = stringsAsFactors,  : 
  The requested url cannot be found within the get_eurostat_json function:
                  Client Error - 100 No results found 
                  Description - The requested resource is not available. https://ec.europa.eu/eurostat/api/dissemination/statistics/1.0/data/TIPSHO40?fromTimePeriod=2005&untilTimePeriod=2007&format=JSON&lang=EN

So it actually gets the HTTP status correctly in the old code ("Request failed [400]") but it prints a wrong message after it ("Client Error - 100 No results found").

Could you fix the code contribution so that

  1. The http request would read and save the error message (http status, id and label) from the response.
  2. The message would return the status, id and label that were saved in the memory. Now it's returning hardcoded messages.
ake123 commented 1 year ago

Solved! Feel free to comment!

pitkant commented 1 year ago

Thanks, it works quite nicely now:

> get_eurostat("TIPSHO40", filter = list(fromTimePeriod = 2005, untilTimePeriod = 2007))
Request failed [400]. Retrying in 1 seconds...
Request failed [400]. Retrying in 3.1 seconds...
Error in get_eurostat_json(id, filters, type = type, stringsAsFactors = stringsAsFactors,  : 
  Status : 400, error id : 150, label : INVALID_QUERY_DIMENSION: Query is invalid as per its structure's definition. Dimension "FROMTIMEPERIOD" is not defined

Maybe it would be nice to add error message and error id labels in writing as well? Such as 400 Bad request, 100 No results found, 140 Syntax error, 150 Semantic error etc. These would make it easier to google for help and find Eurostat website.

Also, when querying for a dataset that does not exist / is not otherwise available maybe it would be helpful to print the old note stored in msg object:

   msg <- ". Some datasets are not accessible via the eurostat
          interface. You can try to search the data manually from the comext
      database at http://epp.eurostat.ec.europa.eu/newxtweb/ or bulk
      download facility at
      http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing
      or annual Excel files
      http://ec.europa.eu/eurostat/web/prodcom/data/excel-files-nace-rev.2"

Especially if in some special cases when content(resp)$error is empty / does not exist:

get_eurostat("", filter = list(fromTimePeriod = 2005, untilTimePeriod = 2007))
Error in get_eurostat_json(id, filters, type = type, stringsAsFactors = stringsAsFactors,  : 
  Status : , error id : , label : 

So there could be a check for that as well.

These examples could also be added to eurostat package tests.

ake123 commented 1 year ago

Now updated with the recommended changes

ake123 commented 1 year ago

The error message is now changed to

> get_eurostat("", filter = list(fromTimePeriod = 2005, untilTimePeriod = 2007))
Error in get_eurostat_json(id, filters, type = type, stringsAsFactors = stringsAsFactors, :
Failure to get data. Status code: 404 Some datasets are not accessible via the eurostat
interface. You can try to search the data manually from the comext
database athttps://ec.europa.eu/eurostat/comext/newxtweb/ or API data query at
https://wikis.ec.europa.eu/display/EUROSTATHELP/API+Statistics+-+data+query
pitkant commented 1 year ago

Alright, thank you for your work, I'll merge the PR