dmlc / XGBoost.jl

XGBoost Julia Package

Changes to predict that allow specification of predict 'type' #170

Closed bobaronoff closed 1 year ago

bobaronoff commented 1 year ago

These changes would give users access to a libxgboost feature that reports feature contributions and/or interactions at the record level. This can be useful for Shapley-type analyses. The additional data is obtained by specifying the 'type' parameter in Lib.XGBoosterPredictFromDMatrix().

A 'type' parameter is added to predict(); input values are 0 through 6. The meaning associated with each value is added to the docstring. This is an optional parameter with a default of 0, which provides normal output.
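For reference, the seven values can be sketched as a lookup table. The descriptions below follow the XGBoost C API documentation for XGBoosterPredictFromDMatrix; the names PREDICT_TYPES and check_predict_type are hypothetical and are not taken from this PR.

```julia
# Prediction types accepted by libxgboost, keyed by the integer passed as 'type'.
# Descriptions paraphrase the XGBoost C API documentation.
const PREDICT_TYPES = Dict(
    0 => "normal prediction",
    1 => "output margin (untransformed score)",
    2 => "feature contributions",
    3 => "approximated feature contributions",
    4 => "feature interactions",
    5 => "approximated feature interactions",
    6 => "leaf indices",
)

# Hypothetical helper: validate a user-supplied 'type' before calling the library.
function check_predict_type(type::Integer)
    haskey(PREDICT_TYPES, type) ||
        throw(ArgumentError("type must be in 0:6, got $type"))
    return PREDICT_TYPES[type]
end
```

Validating up front like this would turn an out-of-range value into a Julia ArgumentError instead of an error raised inside the C library.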

I was not certain what to do with the old parameter 'margin'. I removed it because it is redundant with the 'type' specification, although this might cause issues for others who already use it in their code.
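One possible way to keep existing callers working, shown only as a sketch: accept both keywords and translate 'margin' into the equivalent 'type'. This assumes 'type' 1 is the margin output, as in the XGBoost C API documentation. The helper name resolve_predict_type is hypothetical and not part of XGBoost.jl.

```julia
# Hypothetical helper: map the old 'margin' flag onto the new integer 'type'.
function resolve_predict_type(; margin::Bool=false, type::Integer=0)
    if margin
        # margin=true only makes sense with the default type or with type 1.
        type in (0, 1) ||
            throw(ArgumentError("margin=true conflicts with type=$type"))
        return 1
    end
    return Int(type)
end

resolve_predict_type()             # default: normal prediction
resolve_predict_type(margin=true)  # old-style call maps to the margin type
resolve_predict_type(type=4)       # new-style call passes through unchanged
```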

The dimensions of the returned data vary significantly with 'type' and the booster objective (multi-class objectives return an extra dimension). 'type' 2 and 3 return a 2-dimensional array; 'type' 4 and 5 return a 3-dimensional array. transpose() fails on a 3-dimensional array, so it is replaced with permutedims(). This creates a trade-off: permutedims() allocates new memory for the array, although the resulting Matrix type is more robust than the Transpose type. For normal prediction (i.e. 'type'=0, where the return is a vector), there is no additional allocation, so this should not impact operations that call predict many times (e.g. for the creation of learning curves or cross-validation).
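The transpose()/permutedims() difference can be reproduced with plain Julia arrays, without XGBoost. The array sizes below are made up, and reversing the dimension order is an assumption for illustration; the exact permutation used in the PR is not shown here.

```julia
# Stand-in for a 3-dimensional result; the sizes are arbitrary.
A = reshape(collect(1.0:24.0), 4, 3, 2)

# transpose() is only defined for vectors and matrices, so this call throws.
transpose_failed = try
    transpose(A)
    false
catch
    true
end

# permutedims() accepts any number of dimensions and returns a newly allocated Array.
B = permutedims(A, (3, 2, 1))

# For a matrix, transpose() is a lazy wrapper that shares memory with its parent,
# while permutedims() copies the data into a plain Matrix.
M = [1.0 2.0; 3.0 4.0]
T = transpose(M)
P = permutedims(M)
```

Here T and P hold the same values, but only P is a plain Matrix, which is the robustness-versus-allocation trade-off described above.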

bobaronoff commented 1 year ago

I apologize for how messy the changes look. On my end the comparison was cumulative and not broken up into the 6 tiny commits. This is my first PR ever. Hopefully there is a way you can see the final result compared to master. Bob