Should we experiment on all possible test set types?

bjschoenfeld commented 5 years ago

The test set could contain pipelines from the train set or not, and it could contain datasets from the train set or not. This gives us four types of test sets. Should we address them all? We currently address two: out-of-train-set (OOTS) datasets with in-train-set (ITS) pipelines, and OOTS datasets with OOTS pipelines.

bjschoenfeld commented 5 years ago

We must take care that model selection/hyper-parameter tuning is based on the validation set for each problem. Model selection from one problem cannot be transferred to another.
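To make the point concrete, here is a minimal sketch of per-problem selection. It assumes validation scores are already collected in a nested dict and that higher is better; the function name, the problem names, and the model names are all hypothetical, not taken from this repo.

```python
from typing import Dict


def select_per_problem(val_scores: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Pick the best model for each problem independently.

    val_scores[problem][model_name] is a validation score (assumed: higher is better).
    Each problem's winner is chosen only from that problem's own validation scores,
    so a choice made for one problem is never reused for another.
    """
    return {
        problem: max(scores, key=scores.get)
        for problem, scores in val_scores.items()
    }


# Hypothetical scores: the best model differs between the two problems.
example_scores = {
    "regression": {"model_a": 0.1, "model_b": 0.3},
    "ranking": {"model_a": 0.9, "model_b": 0.2},
}
selected = select_per_problem(example_scores)
```

With these made-up numbers, `model_b` is selected for `regression` and `model_a` for `ranking`, which is exactly the case where transferring a selection across problems would go wrong.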

bjschoenfeld commented 5 years ago

There are 5 test set types as diagrammed below.

             Pipelines
D   |-------------------------|
a   |    0     / |            |
t   |      /     |      3     |
a   |  /      1  |            |
s   |------------|------------|
e   |            |            |
t   |      2     |      4     |
s   |            |            |
    |------------|------------|

Type 0: The test set is the train set.

Type 1: The test set contains pipelines and datasets that both appear in the train set, but that particular pairing of pipeline and dataset does not appear in the train set.

Type 2: The test set contains datasets not found in the train set. The pipelines are the same in both.

Type 3: The test set contains pipelines not found in the train set. The datasets are the same in both.

Type 4: The test set contains both novel datasets and novel pipelines.

Note that pipelines in the test set must be composed of primitives/algorithms contained in the train set, but their particular configuration (pipeline structure or hyper-parameters) may be novel.
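The five types can be sketched as a small classifier over (dataset, pipeline) pairs. This is only an illustration, assuming each train and test instance is identified by a pair of string ids; the function name and the ids are hypothetical and do not come from the repo.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]  # (dataset_id, pipeline_id), hypothetical identifiers


def classify_test_pair(pair: Pair, train_pairs: Set[Pair]) -> int:
    """Return the test set type (0-4) of a single (dataset, pipeline) pair."""
    dataset, pipeline = pair
    train_datasets = {d for d, _ in train_pairs}
    train_pipelines = {p for _, p in train_pairs}

    if pair in train_pairs:
        return 0  # exact pairing was seen in training
    seen_dataset = dataset in train_datasets
    seen_pipeline = pipeline in train_pipelines
    if seen_dataset and seen_pipeline:
        return 1  # both seen, but never paired together
    if seen_pipeline:
        return 2  # novel dataset, known pipeline
    if seen_dataset:
        return 3  # known dataset, novel pipeline
    return 4  # novel dataset and novel pipeline


# Hypothetical train set with two datasets and two pipelines.
train = {("d1", "p1"), ("d2", "p2")}
types = [
    classify_test_pair(p, train)
    for p in [("d1", "p1"), ("d1", "p2"), ("d3", "p1"), ("d1", "p3"), ("d3", "p3")]
]
```

Grouping test pairs by the returned type would let each of the five regions in the diagram be evaluated separately.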

byu-dml / d3m-dynamic-neural-architecture

Should we experiment on all possible test set types? #108