The following behaviour of cfactor is undesirable:
Duplicate levels
`cfactor(c("a", "b"), levels = c("a", "a", "b"))` yields a warning that duplicated factor levels are deprecated, but the duplicate levels are created anyway. This should not happen. Instead, before `factor` is called, duplicates should be removed from `levels` and a warning issued stating that duplicate levels were removed.
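The proposed behaviour could be sketched roughly as follows. `dedupe_levels` is a hypothetical helper name used only for illustration; it is not part of the existing `cfactor` code, and where exactly it would be called inside `cfactor` is left open.

```r
# Hypothetical helper (name and signature are assumptions, not existing code):
# drop duplicated levels and tell the user which ones were dropped.
dedupe_levels <- function(levels) {
  dup <- duplicated(levels)
  if (any(dup)) {
    warning(
      "Duplicate levels were removed: ",
      paste(unique(levels[dup]), collapse = ", "),
      call. = FALSE
    )
  }
  # Keep the first occurrence of each level, preserving the original order.
  levels[!dup]
}

# Clean the levels first, then hand them to factor().
lv <- dedupe_levels(c("a", "a", "b"))
f <- factor(c("a", "b"), levels = lv)
```

Keeping the first occurrence preserves the level order the user specified, which matters if the order of levels is meaningful.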
Confusion between underlying values `x`, `labels` and `levels`
It is actually possible to do the following without getting a warning:
`factor(letters, levels = letters, labels = sample(letters))`. Here, each value x_i is mapped to an arbitrary value y_i, where y_i is itself an element of X = (x_1, ..., x_n). In most situations this is probably not what the user intended, so it should at least raise a warning.
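One possible check is sketched below. `check_label_remap` is a hypothetical name, and the rule it applies (warn when a label equals some level other than the one it is attached to) is only one reading of the problem described above, not an agreed design. It assumes `labels` has the same length as `levels`; other cases are passed through without a check.

```r
# Hypothetical check (an assumption for illustration, not existing code):
# warn when a label is itself one of the levels but is attached to a
# different level, i.e. one existing value is silently renamed to another.
check_label_remap <- function(levels, labels) {
  levels <- as.character(levels)
  labels <- as.character(labels)
  # Only the one-label-per-level case is handled in this sketch.
  if (length(labels) != length(levels)) {
    return(invisible(FALSE))
  }
  remapped <- labels %in% levels & labels != levels
  remapped[is.na(remapped)] <- FALSE  # treat NA comparisons as "not remapped"
  if (any(remapped)) {
    warning(
      sprintf("%d label(s) are equal to a different level.", sum(remapped)),
      call. = FALSE
    )
  }
  invisible(any(remapped))
}

# rev(letters) is used instead of sample(letters) to keep the example deterministic.
check_label_remap(letters, rev(letters))   # warns
check_label_remap(letters, toupper(letters))  # silent: labels are new values
```

Labels that introduce genuinely new values, such as upper-case versions of the levels, pass silently, so ordinary relabelling is not affected.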