Nidhi Mehta commented: ref: https://support.h2o.ai/helpdesk/tickets/90970
Sebastien Poirier commented: Had a first look at this: should be able to keep StackedEnsemble for both stacking and blending strategies as the main difference is on how the level one frame is built. Will delegate this logic to some StackingStrategy class.
Erin LeDell commented: Sounds good. Let's discuss the API... I was thinking that we could have a new arg called holdout_frame to use to train the metalearner.
Sebastien Poirier commented: [~accountid:557058:afd6e9a4-1891-4845-98ea-b5d34a2bc42c], I was first thinking about reusing the validation frame, but now I see that would be wrong, since the holdout frame is then used for training the metalearner, so yeah... we need another one. Not much of a fan of {{holdout_frame}}, to be honest, because validation and test frames are also holdout frames, and this word is overloaded in ML in general, meaning various sets depending on the context. I'd be more explicit: {{blending_training_frame}}, {{second_level_training_frame}}, or something along those lines.
Also, wanted to ask: I've seen two main approaches to blending:
Which one do we want to use? The first approach looks more in the spirit of stacking, and I'm wondering whether the second approach even makes sense for datasets with a large number of predictors. wdyt?
Erin LeDell commented: [~accountid:5b153fb1b0d76456f36daced] Yeah, good point, the name "holdout" is ambiguous. Both {{blending_training_frame}} and {{blending_frame}} seem like good options.
Approach #1 is what we want. That's the traditional approach. I've talked to [~accountid:557058:391327fd-0326-4a45-8dcd-7a42c5142fca] and others about this and they have not seen much value in the #2 approach in practice.
Sebastien Poirier commented: [~accountid:557058:afd6e9a4-1891-4845-98ea-b5d34a2bc42c] after implementing this with a new {{blending_frame}} parameter added to {{StackedEnsembleModel}}, I'm just wondering why we don't simply use {{training_frame}} as usual, plus a {{stacking_mode}} param.
The fact is that when a {{blending_frame}} is provided, we make no use of the {{training_frame}} itself.
The only API that would benefit from an additional frame would be {{AutoML}}... and even there I'm not sure; most likely we would just split the {{training_frame}} internally and keep a split for SE. Any thoughts?
Sebastien Poirier commented: The only use I can see for providing both {{training_frame}} and {{blending_frame}} to the SE model is to ensure that all base models have been trained with a similar frame, and I'm not sure that should even be a requirement. Currently, with CV-stacking, we don't require this; we only require that the base models have all been trained with a frame of the same length, which makes sense because the {{level_one_frame}} is built from the cv-predictions on that frame. For blending, though, I don't see why we should require this.
Sebastien Poirier commented: OK, I think I found the reason why we need to keep {{training_frame}}: this is necessary for computing SE model {{training_metrics}} that are comparable to base models {{training_metrics}}. In this case, I'm leaving things as they are now with the {{blending_frame}} that acts also as a trigger for "blending mode". I'll also ensure that the {{training_frame}} passed to SE has same length as the one used to train base models as we currently do with CV stacking.
JIRA Issue Migration Info
Jira Issue: PUBDEV-4680 Assignee: Sebastien Poirier Reporter: Erin LeDell State: Closed Fix Version: 3.24.0.1 Attachments: N/A Development PRs: Available
This is the version of stacking where you don't use cv-preds to train the metalearner, but instead you score the base models on a holdout set and use those predicted values instead.
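The procedure described above can be sketched generically, with toy stand-in base models and a linear least-squares metalearner (all names and models here are illustrative, not H2O code):

```python
import numpy as np

# Toy base models, assumed already trained elsewhere (hypothetical stand-ins).
base_models = [
    lambda x: x + 1.0,   # base model A
    lambda x: 3.0 * x,   # base model B
]

def build_level_one_frame(models, x):
    """Score each base model on x; each column holds one model's predictions."""
    return np.column_stack([m(x) for m in models])

# Holdout (blending) frame: data the base models were NOT trained on.
x_blend = np.array([1.0, 2.0, 3.0])
y_blend = 2.0 * x_blend  # true target on the holdout set

# Train the metalearner (here a linear least-squares fit) on the base-model
# predictions over the holdout frame, instead of on cv-predictions.
level_one = build_level_one_frame(base_models, x_blend)
weights, *_ = np.linalg.lstsq(level_one, y_blend, rcond=None)

def ensemble_predict(x_new):
    """Score base models on new data, then combine via the metalearner."""
    return build_level_one_frame(base_models, x_new) @ weights

print(ensemble_predict(np.array([10.0])))  # ~ [20.]
```

The only difference from CV-stacking is where the level-one frame comes from: holdout predictions here versus cross-validated predictions there; the final combined model has the same shape in both cases.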
I'm not sure yet whether this should go into the existing Stacked Ensemble class, or whether we should create a new one specifically for this case. The resulting model is the same, though, so it should probably use Stacked Ensemble (with relaxed restrictions on the input models).
There are two main motivations here:
Once we add this, we can add support for this in AutoML as well.