Main updates:
Support converting a LightGBM pickle file to CSV files and loading a LightGBM model from those CSV files.
Dispatch the predict() function according to user input (withMissing/withoutMissing, i.e. whether the data contains missing values); the dispatching speeds up the 1600-tree random forest model on the Higgs dataset.
Change the aggregation functions for all algorithms. I believe the previous aggregation function was wrong, so I changed it; the details are described in my previous document. The basic idea is simple: for XGBoost and LightGBM, the aggregated predicted value is the sum of all predicted leaf values; for random forest, it is the mean of all predicted leaf values. For a classification task, the forest outputs class one if the aggregated value is greater than 0.5 and class zero otherwise.
Enable regression tasks and support the "Year" dataset.
Support loading SVM-format input data into dense matrices.
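For reviewers, the aggregation rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in this change: the `Algorithm` enum and the `aggregate` and `toClass` names are hypothetical, and the per-tree leaf values are assumed to be already collected into a vector.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical names for illustration; not taken from the actual code base.
enum class Algorithm { XGBoost, LightGBM, RandomForest };

// Combine the leaf values predicted by the individual trees into one value:
// the sum for XGBoost and LightGBM, the mean for random forest.
inline double aggregate(Algorithm algo, const std::vector<double>& leafValues) {
    if (leafValues.empty()) {
        return 0.0;  // assumption: an empty ensemble predicts 0
    }
    const double sum =
        std::accumulate(leafValues.begin(), leafValues.end(), 0.0);
    if (algo == Algorithm::RandomForest) {
        return sum / static_cast<double>(leafValues.size());
    }
    return sum;
}

// Classification rule from the description: class one if the aggregated
// value is greater than 0.5, class zero otherwise.
inline int toClass(double aggregated) {
    return aggregated > 0.5 ? 1 : 0;
}
```

With four random forest trees predicting 1, 0, 1, 1, the mean is 0.75 and the sketch returns class one; a value of exactly 0.5 maps to class zero because the comparison is strict.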
Some code refactoring:
Change returnClass in data structures to a union of threshold (for inner nodes) and leafValue (for leaf nodes).
Refactor the function that loads the matrix from an input file to make it easier to understand.
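To illustrate the threshold/leafValue union mentioned above, here is a rough sketch of what such a node could look like. Only `threshold` and `leafValue` come from this description; the other fields, the `float` type, the "negative feature index marks a leaf" convention, and the `<=` comparison are assumptions made for the example.

```cpp
#include <cstdint>

// Hypothetical node layout; only threshold and leafValue are named in the
// description, everything else is assumed for illustration.
struct Node {
    std::int32_t featureIndex;  // feature tested at an inner node; < 0 marks a leaf (assumed)
    std::int32_t leftChild;     // index of the left child in the node array
    std::int32_t rightChild;    // index of the right child in the node array
    union {
        float threshold;  // meaningful only for inner nodes
        float leafValue;  // meaningful only for leaf nodes
    };

    bool isLeaf() const { return featureIndex < 0; }
};

inline Node makeInner(std::int32_t feature, float threshold,
                      std::int32_t left, std::int32_t right) {
    Node n{};
    n.featureIndex = feature;
    n.leftChild = left;
    n.rightChild = right;
    n.threshold = threshold;
    return n;
}

inline Node makeLeaf(float value) {
    Node n{};
    n.featureIndex = -1;
    n.leftChild = -1;
    n.rightChild = -1;
    n.leafValue = value;
    return n;
}

// Walk one tree stored as a flat array, starting at index 0.
// Assumption: go left when the feature value is <= threshold.
inline float predictTree(const Node* nodes, const float* features) {
    std::int32_t i = 0;
    while (!nodes[i].isLeaf()) {
        const Node& n = nodes[i];
        i = (features[n.featureIndex] <= n.threshold) ? n.leftChild
                                                      : n.rightChild;
    }
    return nodes[i].leafValue;
}
```

Because a node is either an inner node or a leaf, the two values never need to exist at the same time, so sharing storage keeps each node small without losing information.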