Open rideofyourlife opened 8 months ago
This is intentional behaviour but if many people find this annoying you can report it under this issue and we can reconsider.
Statistical agencies worldwide have similar standards for treating metadata, and metadata in this case is there to avoid unforeseen logical errors when joining or linking data. I think that the freq variable is present when there are similar statistical products or datasets available with the same variables but different frequencies. In that case a join without frequency adjustment results in a hard-to-find logical error. The freq variable serves the same purpose as the unit variable: you really want to avoid unknowingly dividing euros by thousands of euros, or multiplying annual values in a chain with quarterly values.
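A minimal sketch of the kind of silent error described in this comment, using base R and made-up numbers. The two data frames and their values are assumptions for illustration only, not real Eurostat data, and the column layout simply mirrors the names mentioned in this thread.

```r
# Toy data with hypothetical values (NOT real Eurostat figures).
annual <- data.frame(
  freq = "A",
  geo = "FI",
  TIME_PERIOD = as.Date("2020-01-01"),
  OBS_VALUE = 400
)
quarterly <- data.frame(
  freq = "Q",
  geo = "FI",
  TIME_PERIOD = as.Date(c("2020-01-01", "2020-04-01",
                          "2020-07-01", "2020-10-01")),
  OBS_VALUE = c(90, 100, 105, 105)
)

# Join with freq dropped: the annual 2020 row shares its date with
# Q1 2020, so the two are paired without any warning.
no_freq <- merge(
  annual[, -1], quarterly[, -1],
  by = c("geo", "TIME_PERIOD"),
  suffixes = c("_a", "_q")
)

# Join with freq kept in the key: "A" never equals "Q", so the
# mismatch shows up as an empty result instead of a wrong number.
with_freq <- merge(annual, quarterly, by = c("freq", "geo", "TIME_PERIOD"))

nrow(no_freq)    # 1 row: annual total next to a first-quarter value
nrow(with_freq)  # 0 rows
```

With freq removed, a ratio such as OBS_VALUE_a / OBS_VALUE_q would be computed without complaint even though it compares a full year with a single quarter.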
Statistical agencies worldwide have similar standards for treating metadata, and metadata in this case is there to avoid unforeseen logical errors when joining or linking data;
Well, we are all aware of that. At least I hope that is the case.
In that case a join without frequency adjustment results in a hard-to-find logical error. The freq variable serves the same purpose as the unit variable: you really want to avoid unknowingly dividing euros by thousands of euros, or multiplying annual values in a chain with quarterly values.
This assumes users are somewhat unaware of what they are doing. It seems to me that implementing this technique is a triumph of form over content.
@rideofyourlife I have uploaded some WIP code in the v4.1 branch. It enables users to make queries the same way as before but adds an additional parameter legacy.data.output to the get_* functions that transforms dimension names such as TIME_PERIOD and OBS_VALUE to time and values that were used before and removes extra columns such as freq, DATAFLOW and LAST UPDATE altogether.
If you could test this and give some feedback on what you think, that would be great!
I have already laboriously replaced "time" with "TIME_PERIOD" in all my code, so having "time" back is not as essential now as it was before the recent change. Despite that, where do I use this legacy.data.output? In which function?
I'm sorry for the laborious process. In version 4.1, the legacy.data.output = TRUE parameter in the get_eurostat() function should return a data.frame / tibble similar to the one returned in version 3.8.3 and before.
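To make the described transformation concrete, here is a rough base R sketch of what the legacy output amounts to according to this thread: TIME_PERIOD and OBS_VALUE renamed to time and values, and freq, DATAFLOW and LAST UPDATE dropped. The helper to_legacy() and the toy data frame are hypothetical and written only for illustration; this is not the package's implementation.

```r
# Hypothetical helper for illustration only, NOT the package's code.
to_legacy <- function(df) {
  # Columns the thread says are removed in the legacy output.
  extra <- c("freq", "DATAFLOW", "LAST UPDATE")
  out <- df[, !(names(df) %in% extra), drop = FALSE]
  # Rename the dimensions back to the names used before.
  names(out)[names(out) == "TIME_PERIOD"] <- "time"
  names(out)[names(out) == "OBS_VALUE"] <- "values"
  out
}

# Toy data frame with made-up content in the new column layout.
new_style <- data.frame(
  DATAFLOW = "EXAMPLE",
  `LAST UPDATE` = "2024-01-01",
  freq = "Q",
  geo = "FI",
  TIME_PERIOD = as.Date(c("2020-01-01", "2020-04-01")),
  OBS_VALUE = c(90, 100),
  check.names = FALSE  # keep the space in "LAST UPDATE"
)

legacy_style <- to_legacy(new_style)
names(legacy_style)  # "geo" "time" "values"

# With the package itself, the call described above would look like:
# dat <- get_eurostat("namq_10_gdp", legacy.data.output = TRUE)
```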
Ah, yes: it works. For some reason it is just not suggested by RStudio's autocompletion while typing.
Many datasets that contain only one frequency (like namq_10_gdp, sts_inpr_m etc.) were given a new variable "freq". I generally understand the idea behind it, but while working with the package it has often proven to be just an unnecessary extra step of %>% select(-freq) in the majority of the code I write. Does anyone else have similar thoughts?
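One possible middle ground, sketched in base R: drop freq only after checking that the data really contains a single frequency. The helper drop_single_freq() is hypothetical and not part of the package; it assumes the column is named freq as in this thread.

```r
# Hypothetical helper, not part of the eurostat package.
drop_single_freq <- function(df) {
  # Nothing to do if there is no freq column.
  if (!"freq" %in% names(df)) {
    return(df)
  }
  freqs <- unique(df$freq)
  # Refuse to drop the column when frequencies are mixed.
  if (length(freqs) > 1) {
    stop("Multiple frequencies present: ", paste(freqs, collapse = ", "))
  }
  df[, names(df) != "freq", drop = FALSE]
}

# Toy data with made-up values.
quarterly_only <- data.frame(
  freq = "Q",
  geo = "FI",
  OBS_VALUE = c(90, 100)
)
mixed <- data.frame(
  freq = c("A", "Q"),
  geo = "FI",
  OBS_VALUE = c(400, 90)
)

cleaned <- drop_single_freq(quarterly_only)
names(cleaned)  # "geo" "OBS_VALUE"

# Mixed frequencies raise an error instead of being dropped silently.
mixed_result <- try(drop_single_freq(mixed), silent = TRUE)
inherits(mixed_result, "try-error")  # TRUE
```

This keeps the convenience of a one-line cleanup while still failing loudly in the case the maintainers are worried about.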