frictionlessdata / datapackage

Data Package is a standard consisting of a set of simple yet extensible specifications to describe datasets, data files and tabular data. It is a data definition language (DDL) and data API that facilitates findability, accessibility, interoperability, and reusability (FAIR) of data.
https://datapackage.org
The Unlicense

Remove enum-labels-and-ordering recipe #958

Open peterdesmet opened 2 days ago

peterdesmet commented 2 days ago

I believe the Enum Labels and Ordering recipe is largely (or completely) implemented in v2.

From a maintenance perspective, I think it is better to remove it completely. @pschumm @khusmann what do you think?

khusmann commented 2 days ago

I agree. I think it can safely be removed, and doing so would help avoid confusion.

For future reference, I'll mention here that the only capability of the v1 pattern not implemented in v2 is the ability to label non-categorical values. To quote @pschumm's comment on the original v2 PR:

I would point out that there is one possible use of value labels or formats (only relevant for Stata, SAS or SPSS) that this does not accommodate; namely, the case where you want to label only a few values that are not missing values but you don't want to have to enumerate all possible values in the schema. For example, you might have a top-coded age variable where you want to label the value 90 with "90 or older" but you don't want to have to enumerate all of the integers between 1-90. This may or may not be something you want to treat as categorical in your analyses, and so I don't really see this as a limitation in the way you've defined a categorical type. But it does prevent you from packing all of the information you need to define your value labels or formats into the schema.

We ended up deciding that this capability is orthogonal to the definition of categoricals, even though the two concepts are conflated in the SPSS / SAS / Stata implementations of value labels. So we figured that if there's demand for it later, we can add this feature as a separate concern.
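
To make the gap concrete, here is a minimal sketch of the top-coded age example in Python. It assumes the v2 field properties are spelled `categories` (a list of objects with `value` and `label`) and `categoriesOrdered`; please check those names against the published spec before relying on them. The `SPARSE_LABELS` dict and the `display_age` helper are hypothetical and not part of any Data Package library; they only show where a label for a single value would have to live if it stays outside the schema.

```python
import json

TOP_CODE = 90  # top-coded age from the example quoted above


def age_field_v2(top_code: int = TOP_CODE) -> dict:
    """Build an integer field whose value labels live in `categories`.

    Every allowed value has to be listed, even though only one of them
    (the top code) actually needs a descriptive label.
    Property names are an assumption about the v2 Table Schema.
    """
    categories = [
        {"value": v, "label": f"{v} or older" if v == top_code else str(v)}
        for v in range(1, top_code + 1)
    ]
    return {
        "name": "age",
        "type": "integer",
        "categories": categories,
        "categoriesOrdered": True,
    }


# Hypothetical alternative: keep the field a plain integer and store the
# single label in a sparse mapping outside the schema.
SPARSE_LABELS = {TOP_CODE: f"{TOP_CODE} or older"}


def display_age(value: int) -> str:
    """Return the label for a value if one exists, else the value itself."""
    return SPARSE_LABELS.get(value, str(value))


field = age_field_v2()
print(len(field["categories"]))             # number of enumerated levels
print(json.dumps(field["categories"][-1]))  # the one label we cared about
print(display_age(42), "|", display_age(TOP_CODE))
```

The first approach packs the label into the schema at the cost of enumerating every level and declaring the field categorical; the second keeps the field numeric but leaves the label outside the descriptor, which is the limitation the quote describes.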