h2oai / mli-resources

H2O.ai Machine Learning Interpretability Resources

Oracle that generates Smin samples #6

Open thirdeye802 opened 6 years ago

thirdeye802 commented 6 years ago

In the given example, a GBM is used to create the tree, but TREPAN generates samples to satisfy the minimum number of samples required (Smin). How does the GBM do that here?

jphall663 commented 5 years ago

This is likely a valid concern, but it is not addressed here.

However, in my experience, several companies use these types of decision tree surrogate models in ways that have proven effective over time on real data, and they do not use TREPAN or other more specialized surrogate model extraction procedures. Instead, they train one surrogate model, as done here, on a hold-out data set, then evaluate that surrogate model across folds for stability. If error metrics are adequate on the single partition and across folds during cross-validation, they use the surrogate model.
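A minimal sketch of that workflow, for illustration only. It assumes scikit-learn and synthetic data rather than the H2O models and data used in this repository, and the names `fit_surrogate`, `single_r2`, and `fold_r2`, the depth limit of 3, and the choice of R² as the error metric are all assumptions, not something taken from the example notebook. Note that no TREPAN-style oracle sampling happens here: the surrogate tree is simply fit to the GBM's predictions on existing hold-out rows.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import KFold, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data stands in for a real data set (assumption).
X, y = make_regression(n_samples=2000, n_features=8, n_informative=5,
                       noise=10.0, random_state=0)
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.5, random_state=0)

# The complex model to be explained.
gbm = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# The surrogate's target is the GBM's predictions, not the original y.
gbm_hold_pred = gbm.predict(X_hold)


def fit_surrogate(X_fit, target):
    """Fit a shallow, readable tree to the complex model's predictions."""
    return DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_fit, target)


# Step 1: one surrogate trained on the hold-out partition.
surrogate = fit_surrogate(X_hold, gbm_hold_pred)
single_r2 = r2_score(gbm_hold_pred, surrogate.predict(X_hold))

# Step 2: refit across folds of the same partition to check stability.
fold_r2 = []
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for fit_idx, val_idx in kfold.split(X_hold):
    fold_tree = fit_surrogate(X_hold[fit_idx], gbm_hold_pred[fit_idx])
    fold_r2.append(
        r2_score(gbm_hold_pred[val_idx], fold_tree.predict(X_hold[val_idx])))
fold_r2 = np.array(fold_r2)

print(f"single-partition R^2: {single_r2:.3f}")
print(f"fold R^2 mean: {fold_r2.mean():.3f}, std: {fold_r2.std():.3f}")
```

If the fold scores are close to each other and to the single-partition score, the surrogate is reasonably stable; what counts as "adequate" is a judgment call that depends on the application.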