Open bjschoenfeld opened 5 years ago
We must take care that model selection/hyper-parameter tuning is based on the validation set for each problem. Model selection from one problem cannot be transferred to another.
There are 5 test set types as diagrammed below.
            Pipelines
D  |-------------------------|
a  | 0      /   |            |
t  |      /     |     3      |
a  |    /    1  |            |
s  |------------|------------|
e  |            |            |
t  |     2      |     4      |
s  |            |            |
   |------------|------------|
Type 0: Test set is the train set
Type 1: The test set contains pipelines and datasets that both appear in the train set, but that particular pipeline/dataset pairing is not contained in the train set.
Type 2: The test set contains datasets not found in the train set. The pipelines are the same in both.
Type 3: The test set contains pipelines not found in the train set. The datasets are the same in both.
Type 4: The test set contains both novel datasets and novel pipelines.
Note that pipelines in the test set must be composed of primitives/algorithms contained in the train set, but their particular configuration (pipeline structure or hyper-parameters) may be novel.
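To make the five types concrete, here is a minimal sketch of how a single test pair could be assigned a type. It assumes each example is a `(pipeline_id, dataset_id)` tuple and that the train set is a set of such tuples; the function name `classify_test_pair` and the ids are hypothetical and not taken from this repo.

```python
def classify_test_pair(pair, train_pairs):
    """Return the test set type (0-4) for one (pipeline_id, dataset_id) pair.

    Hypothetical helper: assumes train_pairs is a set of
    (pipeline_id, dataset_id) tuples.
    """
    pipeline, dataset = pair
    train_pipelines = {p for p, _ in train_pairs}
    train_datasets = {d for _, d in train_pairs}

    if pair in train_pairs:
        return 0  # exact pairing was seen during training
    seen_pipeline = pipeline in train_pipelines
    seen_dataset = dataset in train_datasets
    if seen_pipeline and seen_dataset:
        return 1  # both seen, but never paired together
    if seen_pipeline:
        return 2  # novel dataset, known pipeline
    if seen_dataset:
        return 3  # novel pipeline, known dataset
    return 4      # both novel


# Made-up ids, for illustration only.
train = {("p1", "d1"), ("p2", "d2")}
types = [
    classify_test_pair(pair, train)
    for pair in [("p1", "d1"), ("p1", "d2"), ("p1", "d3"), ("p3", "d1"), ("p3", "d3")]
]
```

A test set is then of a single type only if every pair in it maps to the same value, which may be a useful sanity check when building the splits.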
The test set could have pipelines from the train set, or not; it could have datasets from the train set, or not. This gives us four types of test sets. Should we address them all? We currently address two: OOTS (out-of-train-set) datasets with ITS (in-train-set) pipelines, and OOTS datasets with OOTS pipelines.