Optimizing Shapley calculations to run faster. Two main changes:
BaggedModel.scala:
The .shapley method in BaggedModel calls .shapley on each model in the ensemble and then sums the results. This is optimized by:
Parallelizing (par) the Shapley calculation for each model.
Simplifying the reduction step by adding matrices directly instead of using a separate reducer function.
Avoiding the creation of intermediate Option objects during reduction.
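
For reviewers, here is a rough sketch of the shape of this change. It is not the code in this PR: Model, Matrix, add and ensembleShapley are hypothetical stand-ins, and the input type is assumed. The PR uses par; the sketch swaps in Future.traverse from the standard library so it runs without the separate scala-parallel-collections module that Scala 2.13+ needs for .par.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object BaggedShapleySketch {
  // Hypothetical stand-in for a Shapley result matrix (rows x features).
  type Matrix = Vector[Vector[Double]]

  // Hypothetical stand-in for one model in the ensemble.
  trait Model { def shapley(input: Vector[Double]): Matrix }

  // Element-wise sum of two matrices of equal shape.
  def add(a: Matrix, b: Matrix): Matrix =
    a.zip(b).map { case (ra, rb) => ra.zip(rb).map { case (x, y) => x + y } }

  def ensembleShapley(models: Seq[Model], input: Vector[Double]): Matrix = {
    require(models.nonEmpty, "ensemble must contain at least one model")
    // Evaluate every model concurrently. With parallel collections the
    // equivalent would be models.par.map(_.shapley(input)).
    val perModel = Future.traverse(models)(m => Future(m.shapley(input)))
    // Sum the matrices directly: no Option wrapper, no separate reducer.
    Await.result(perModel, Duration.Inf).reduce(add)
  }
}
```

Because the per-model results are plain matrices, the reduction is a single reduce over an associative addition, which is also what makes it safe to evaluate the models in any order.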
ModelNode.scala:
Small optimizations:
Precompute hotPortion and coldPortion in InternalModelNode to avoid repeated calculation.
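
To illustrate the idea only: the sketch below stores the two portions as vals computed once at construction instead of recomputing them on each call. The class, its fields, and the definition of the portions as fractions of a weight total are all assumptions made for the example and may not match how InternalModelNode defines them.

```scala
object PrecomputeSketch {
  // Hypothetical node; hotWeight and coldWeight are assumed inputs,
  // not fields of the real InternalModelNode.
  final case class NodeSketch(hotWeight: Double, coldWeight: Double) {
    private val total: Double = hotWeight + coldWeight
    // Computed once here rather than inside every Shapley call.
    val hotPortion: Double = hotWeight / total
    val coldPortion: Double = coldWeight / total
  }
}
```

Declaring these as val rather than def is the whole change in this sketch: repeated reads return the stored value instead of redoing the division.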
Together, these two changes brought the run time from ~30s down to ~7s for a dataset of 100 points (tested locally).