jonmcalder / refactor

:bookmark: Better factor handling for R
https://jonmcalder.github.io/refactor/
MIT License

additional warnings at point of factor creation #15

Closed lorenzwalthert closed 7 years ago

lorenzwalthert commented 8 years ago

The following behaviour of cfactor is undesirable:

Duplicate levels

cfactor(c("a", "b"), levels = c("a", "a", "b")) yields a warning that duplicated factor levels are deprecated, but the factor is created with the duplicated levels anyway. This should not happen. Instead, before factor is called, we should remove the duplicates from levels and issue a warning saying that duplicate levels were removed.
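
A minimal sketch of what the proposed check could look like, assuming it runs on the levels argument before factor is called. The helper name dedupe_levels is hypothetical and not part of refactor:

```r
# Hypothetical helper (not part of refactor): drop duplicated levels and
# warn about it, so that factor() only ever sees unique levels.
dedupe_levels <- function(levels) {
  dup <- duplicated(levels)
  if (any(dup)) {
    warning(
      "Duplicate levels were removed: ",
      paste(unique(levels[dup]), collapse = ", "),
      call. = FALSE
    )
    levels <- levels[!dup]
  }
  levels
}

# Usage: the warning is raised once, and the factor is built from the
# de-duplicated levels.
lv <- dedupe_levels(c("a", "a", "b"))
f <- factor(c("a", "b"), levels = lv)
levels(f)
```

Keeping the first occurrence of each level preserves the order the user supplied, which seems like the least surprising choice here.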

Confounding around underlying values x, labels and levels

It is actually possible to do the following without getting a warning: factor(letters, levels = letters, labels = sample(letters)). Here, each value x_i is mapped to an arbitrary value y_i, where y_i is itself an element of X = (x_1, ..., x_n), so a value can end up labelled with the name of a different level. In most situations, this is probably not what the user wanted. It should at least cause a warning.
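
One way such a warning could be triggered, as a rough sketch only: compare labels against levels position by position and warn when a label equals a level other than the one it is assigned to. The helper name warn_relabel_clash is hypothetical, and the sketch assumes labels and levels have the same length:

```r
# Hypothetical helper (not part of refactor): warn when a label is
# identical to a level at a different position, i.e. when x_i is mapped
# to some other x_j.
warn_relabel_clash <- function(levels, labels) {
  levels <- as.character(levels)
  labels <- as.character(labels)
  # Assumption: only the one-label-per-level case is checked here.
  if (length(labels) != length(levels)) {
    return(invisible(FALSE))
  }
  clash <- labels %in% levels & labels != levels
  if (any(clash)) {
    warning(
      "Some labels are identical to other levels: ",
      paste(levels[clash], "->", labels[clash], collapse = ", "),
      call. = FALSE
    )
  }
  invisible(any(clash))
}

# Usage: "a" is relabelled as "b", which is itself a level, so this warns.
warn_relabel_clash(levels = c("a", "b", "c"), labels = c("b", "c", "a"))
```

Labels that do not appear among the levels, or that match their own level, pass silently, so ordinary relabelling is unaffected.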