Closed PiotMik closed 2 years ago
For categorical features the problem is straightforward - use dummy encoding.
For ordinal features it is more complicated:
For sure we can use ordinal encoding for all "passenger vote" variables.
But the same may fail for the Class
feature, since e.g. higher expectations of Business Class passengers may skew the satisfaction score down, even though the journey quality is higher. For this reason dummy encoding will be used for Class
.
The choice of ordinal encoding for "note" features should be cross-checked for particular models during #4 .
Resulting dataframe:
Dataset consists of many categorical and ordinal features. Think through an appropriate encoding, apply and describe the idea in
.Rmd