Oracle that generates Smin samples..

This is likely a valid concern, but it is not addressed here.

However, in my experience with several companies that use these types of decision tree surrogate models in ways that have proven effective on real data over time, Trepan and other more specialized surrogate model extraction procedures are not used. Instead, they train a single surrogate model, as done here, on a hold-out data set, then evaluate that surrogate model across folds for stability. If the error metrics are adequate both on the single partition and across folds during cross-validation, they use the surrogate model.
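To make the workflow described above concrete, here is a minimal sketch. It assumes scikit-learn rather than the tooling used in this repository, and the synthetic data, the gradient boosting "complex" model, the tree depth, and the fold count are all illustrative choices, not anything prescribed in this thread.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data stands in for real data (assumption for illustration only).
X, y = make_regression(n_samples=1000, n_features=5, noise=10.0, random_state=0)
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# The complex model to be explained (hypothetical choice of model).
complex_model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# The surrogate's target is the complex model's predictions, not the true labels.
yhat_hold = complex_model.predict(X_hold)

# Step 1: train one shallow surrogate tree on the hold-out partition.
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(X_hold, yhat_hold)
single_r2 = surrogate.score(X_hold, yhat_hold)

# Step 2: refit the same surrogate specification across folds to check stability.
fold_r2 = cross_val_score(
    DecisionTreeRegressor(max_depth=3, random_state=0),
    X_hold,
    yhat_hold,
    cv=5,
    scoring="r2",
)

print(f"R^2 on single partition: {single_r2:.3f}")
print(f"R^2 across folds: mean={fold_r2.mean():.3f}, std={fold_r2.std():.3f}")
```

A large gap between the single-partition score and the fold scores, or a high spread across folds, would suggest the surrogate is unstable. What counts as "adequate" is a judgment call that this sketch does not settle.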

h2oai / mli-resources, issue #6