Closed michael135 closed 4 years ago
vtreat uses the column type to determine the processing. A column that you consider categorical, but happens to have only values like 1, 2, 3 will likely be coded as numeric if it came from a csv reader (as csv files don't have types, the reader just guesses at types). My advice is: always examine column types using the Pandas .dtypes attribute and convert columns to string using a command such as data['ColumnID'] = data['ColumnID'].astype(str).
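To make the advice above concrete, here is a minimal sketch of the check-and-convert step. It uses only Pandas; the CSV contents and the x and y columns are made up for illustration, and only the ColumnID name comes from the command above. The sketch stops before any vtreat call, so how the converted frame is then handed to vtreat is left as an assumption.

```python
import io

import pandas as pd

# Hypothetical CSV text: 'ColumnID' is meant to be categorical,
# but it holds only digits, so the reader will guess a numeric type.
csv_text = "ColumnID,x,y\n5,0.1,1.0\n7,0.2,2.0\n8,0.3,3.0\n1,0.4,4.0\n"
data = pd.read_csv(io.StringIO(csv_text))

# Step 1: examine the types the reader guessed.
print(data.dtypes)
was_numeric = pd.api.types.is_numeric_dtype(data['ColumnID'])

# Step 2: convert the column you consider categorical to strings.
data['ColumnID'] = data['ColumnID'].astype(str)

# Step 3: confirm the column is no longer numeric before fitting a treatment.
is_numeric_now = pd.api.types.is_numeric_dtype(data['ColumnID'])
print(data.dtypes)
```

After the conversion the column is no longer numeric, so a tool that dispatches on column type should see it as a string column. The other numeric columns are left untouched.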
If a categorical column contains only numeric values (like 5, 7, 8, 1), how do I tell vtreat.NumericOutcomeTreatment that it is categorical? Or is the simplest way to convert the numeric values in that column to some kind of strings?