Open sweingwifo opened 9 months ago
Thank you for opening this issue. The issue seems to be more about outdated / badly worded documentation than an actual error. By following the function example, i.e. by using a data object as input to argument x and leaving argument dic as NULL ("If NULL (default) dictionary names taken from column names of the data_frame"):
lp <- get_eurostat("nama_10_gdp")
lpl <- label_eurostat(lp)
things worked fine for me. If you have any other errors please don't hesitate to bring them up here or open another issue.
We will clarify the documentation in the next update.
Thank you for your prompt response and for looking into the issue.
Upon further assessment, I have noted that while labeling of datasets does work as described when passing a data object to the label_eurostat function, there seems to be a change in the behavior compared to previous versions regarding the use of a string input.
In the past, it was possible to use the function by directly providing a string representing a eurostat code.
However, this functionality no longer appears to work in the current version, resulting in the '404 error Not Found'. This change has impacted certain workflows that relied on string inputs for dataset labeling.
Would it be possible to reinstate this feature, or should we adjust our workflows to only use data objects as input?
Eurostat removed the old "dictionaries" (.dic files) when the old bulk download service was decommissioned. These were basically just lookup tables that had the code in the 1st column and the definition (label) in the 2nd column. The alternative to using this big list is described in the Eurostat document "API - Migrating from Bulk Download Listing urls to API urls", under "Retrieve all dataflows 'stubs' (only references and title) in XML".
While by modern standards it's not a lot, I was a bit hesitant about writing software that relies on fetching a 6.5 MB XML file or a 4.2 MB .tsv file for such a simple lookup operation. At least the old table_dic.dic file had the decency of being only ~620 KB. Many datasets are of course much bigger than 6.5 MB, so the additional traffic to Eurostat probably wouldn't be that big of an issue... And anyway, there are other ways to implement this feature than downloading all the variable labels, such as fetching metadata based on the dataset name, as was done before.
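To illustrate the kind of lookup the old dictionaries made possible, here is a minimal base R sketch. It assumes a two-column, tab-separated table in the layout described above (code first, label second); the `dic_text` sample and the `lookup_labels` helper are hypothetical and are not part of the eurostat package.

```r
# Hypothetical sample in the old .dic layout:
# code in the first column, label in the second, tab-separated.
dic_text <- paste(
  "nama_10_gdp\tGDP and main components (output, expenditure and income)",
  "nama_10_fte\tAverage full time adjusted salary per employee",
  sep = "\n"
)

dic <- read.delim(
  text = dic_text,
  header = FALSE,
  col.names = c("code", "label"),
  quote = "",
  stringsAsFactors = FALSE
)

# Look up labels by code, ignoring case; unknown codes come back as NA.
lookup_labels <- function(codes, dic) {
  labels <- dic$label[match(tolower(codes), tolower(dic$code))]
  names(labels) <- codes
  labels
}

lookup_labels(c("NAMA_10_GDP", "unknown_code"), dic)
```

With a table like this cached locally, labeling a whole vector of codes is a single `match()` call rather than one request per code.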
Although, are you certain that label_eurostat_tables("nama_10_gdp", lang = "en") wouldn't serve the purpose? Judging by the contents of table_dic, it just returns the name (label) of the dataset.
Here's the old table_dic file for reference: table_dic.dic.zip
Thanks for the detailed explanation and the context.
The label_eurostat_tables function does indeed work for retrieving the name of a single dataset, and it serves my purpose for individual codes. I apologize for any confusion; my use case actually involves working with a vector of dataset codes, which is why I used the functionality that accepts a string vector input. The help file suggests that vectors can be used as input, but it seems that this is not currently supported.
While it's not as convenient, I can iterate over my vector of dataset codes using the label_eurostat_tables function to get the labels. This will be a bit more time consuming than the previous method, but it is a workaround.
Thank you again for your support and for considering the reinstatement/adjustment of this feature.
> While it's not as convenient, I can iterate over my vector of dataset codes using the label_eurostat_tables function to get the labels. This will be a bit more time consuming than the previous method, but it is a workaround.
I don't think my way of solving this would be much different from what you describe here. If you have the codes already in a vector it's relatively straightforward to label them e.g. by using sapply:
codes <- c("NAMA_10_GDP", "NAMA_10_LP_A21", "NAMA_10_FTE")
names <- sapply(codes, label_eurostat_tables)
> names
NAMA_10_GDP
"GDP and main components (output, expenditure and income)"
NAMA_10_LP_A21
"Labour productivity and unit labour costs at industry level"
NAMA_10_FTE
"Average full time adjusted salary per employee"
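If some codes in the vector might be invalid or a request might fail, the same loop can be made a little more defensive. The sketch below is only one possible approach: `label_many` and `fake_labeler` are hypothetical helpers, not part of the eurostat package, and the commented call assumes label_eurostat_tables accepts a single code plus a `lang` argument, as in the call shown earlier in this thread.

```r
# label_many() is a hypothetical helper, not part of the eurostat package.
# `labeler` is any function that maps one dataset code to one label.
label_many <- function(codes, labeler, ...) {
  vapply(
    codes,
    function(code) {
      tryCatch(
        as.character(labeler(code, ...))[1],
        # A failed lookup (e.g. an HTTP error) becomes NA instead of
        # stopping the whole loop.
        error = function(e) NA_character_
      )
    },
    character(1),
    USE.NAMES = TRUE
  )
}

# Intended use (needs the eurostat package and network access):
# label_many(codes, eurostat::label_eurostat_tables, lang = "en")

# Offline stand-in so the sketch runs without network access.
fake_labeler <- function(code, ...) {
  if (code == "BAD_CODE") stop("404 Not Found")
  paste("label for", code)
}

label_many(c("NAMA_10_GDP", "BAD_CODE"), fake_labeler)
```

Using `vapply` rather than `sapply` guarantees a character vector of the same length as the input, so a failed code shows up as `NA` next to its name.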
Description
The label_eurostat function is currently failing and returns a 404 error Not Found, suggesting there could be an issue with incorrectly specified paths in the latest update of the Eurostat package.
Steps to reproduce
label_eurostat("nama_10_gdp", dic = "table_dic")
Expected Behavior
The function is expected to retrieve the labels for the dataset without any errors.
Actual Behavior
The function call results in a 404 error Not Found.