imbs-hl / ranger

A Fast Implementation of Random Forests
http://imbs-hl.github.io/ranger/

Accept non-numeric input in standalone C++ version #271

Open JonoDW opened 6 years ago

JonoDW commented 6 years ago

Hi

I've noticed that the C++ standalone version outputs a file of "0"s where the classifications should be. I've tried the C++ example given in the paper "A Fast Implementation of Random Forests for High Dimensional Data in C++ and R" with the same result. Is this a bug or am I doing something wrong?

mnwright commented 6 years ago

Could you post the files you used for training and prediction somewhere to reproduce?

JonoDW commented 6 years ago

The files are attached. They are taken from the "iris" dataset.

The commands used were as follows:

./ranger --verbose --file iris_data/iris_train --depvarname Species --treetype 1 --write

./ranger --verbose --file iris_data/iris_test --predict ranger_out.forest

Thanks

iris_test.txt iris_train.txt

mnwright commented 6 years ago

Thanks. Currently only numeric values are supported in the C++ version. Please code the Species as numeric. We should change the C++ version to accept non-numerics (I've changed the title).

JonoDW commented 6 years ago

Thanks for the feedback. It works with your advice (encoding the depvar in numeric format).
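
For anyone hitting the same problem, here is a minimal sketch of the workaround in Python. It is not part of ranger. The helper names `encode_labels` and `encode_file` are hypothetical, and the sketch assumes the input files are whitespace-separated, have a header row, and contain no quoted fields. The mapping built from the training file is reused for the test file so that both files use the same codes.

```python
def encode_labels(rows, column, mapping=None):
    """Replace the string values in `column` with integer codes.

    rows:    list of rows (lists of strings); rows[0] is the header.
    mapping: optional dict of label -> code to reuse, e.g. the one
             built from the training file.
    Returns (encoded_rows, mapping). The input rows are not modified.
    """
    mapping = dict(mapping or {})
    header = rows[0]
    idx = header.index(column)
    encoded = [list(header)]
    for row in rows[1:]:
        label = row[idx]
        if label not in mapping:
            # Codes are assigned in order of first appearance, starting at 1.
            mapping[label] = len(mapping) + 1
        new_row = list(row)
        new_row[idx] = str(mapping[label])
        encoded.append(new_row)
    return encoded, mapping


def encode_file(src, dst, column, mapping=None):
    """Read a whitespace-separated file, encode `column`, write to `dst`."""
    with open(src) as f:
        rows = [line.split() for line in f if line.strip()]
    encoded, mapping = encode_labels(rows, column, mapping)
    with open(dst, "w") as f:
        for row in encoded:
            f.write(" ".join(row) + "\n")
    return mapping


if __name__ == "__main__":
    # Assumed paths, following the commands earlier in this thread.
    codes = encode_file("iris_data/iris_train", "iris_data/iris_train_num", "Species")
    encode_file("iris_data/iris_test", "iris_data/iris_test_num", "Species", codes)
    print(codes)
```

The encoded files can then be passed to `--file` in place of the originals. Keep the printed mapping around, since the predictions will come back as the numeric codes rather than the species names.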