barisozmen / deepaugment

Discover augmentation strategies tailored for your dataset

list index out of range when using child_first_train_epochs #27

Open sun-yitao opened 5 years ago

sun-yitao commented 5 years ago

When child_first_train_epochs is set (I tried 15 and 20), the following error occurs after child_first_train_epochs epochs of training:

Traceback (most recent call last):
  File "run_deepaugment.py", line 51, in <module>
    deepaug = DeepAugment(images=x_train, labels=y_train.reshape(TRAIN_SET_SIZE, 1), config=my_config)
  File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/deepaugment/lib/decorators.py", line 106, in wrapper
    return func(*args, **kwargs)
  File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/deepaugment/deepaugment.py", line 120, in __init__
    self._do_initial_training()
  File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/deepaugment/deepaugment.py", line 202, in _do_initial_training
    -1, ["first", 0.0, "first", 0.0, "first", 0.0, 0.0], 1, None, history
  File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/deepaugment/notebook.py", line 38, in record
    new_df["B_aug2_magnitude"] = trial_hyperparams[7]
IndexError: list index out of range

Here is my config used:

my_config = {
    'model': 'wrn_16_2',
    'train_set_size': int(TRAIN_SET_SIZE*0.75),
    'child_epochs': 60,
    'child_batch_size': 64,
    'child_first_train_epochs': 20,
    'opt_samples': 1,
}

Here TRAIN_SET_SIZE is the size of a custom dataset of 3000 examples. The code runs fine if I omit the child_first_train_epochs setting.

JordanMakesMaps commented 5 years ago

The validation_set_size defaults to 1000, and those 1000 images come from the training data you provide. Since int(TRAIN_SET_SIZE * 0.75) == 2250, only 750 images are left for the validation portion of the data, which results in an error. So you might need to decrease the value you pass as train_set_size 🤔
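As a rough sketch of that arithmetic (hypothetical variable names; this assumes the 1000 validation images are carved out of the images you pass in):

TRAIN_SET_SIZE = 3000
VAL_SET_SIZE = 1000                            # default validation_set_size

train_set_size = int(TRAIN_SET_SIZE * 0.75)    # 2250
remaining = TRAIN_SET_SIZE - train_set_size    # only 750 left over
assert remaining >= VAL_SET_SIZE, "not enough images left for validation"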

I'm also having issues with that variable and the value passed to train_set_size. Hope this works for you though!

sun-yitao commented 5 years ago

I have tried train_set_size = TRAIN_SET_SIZE - 1000, but it doesn't work. From my reading of build_features.py, the 1000 validation images are sampled from the images left out of training:

import numpy as np

def sample_validation_set(data):
    # Sample up to 1000 images (without replacement) from the held-out
    # validation seed pool.
    val_seed_size = len(data["X_val_seed"])
    ix = np.random.choice(range(val_seed_size), min(val_seed_size, 1000), False)
    X_val = data["X_val_seed"][ix].copy()
    y_val = data["y_val_seed"][ix].copy()
    return X_val, y_val
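So with train_set_size = 2000, the seed pool holds 1000 images and min(val_seed_size, 1000) samples all of them; the sampling itself cannot index out of range. A quick check with those hypothetical sizes:

import numpy as np

val_seed_size = 1000                           # 3000 total - 2000 train
ix = np.random.choice(range(val_seed_size), min(val_seed_size, 1000), False)
print(len(ix))                                 # 1000 -- no IndexError here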

The error seems to come from the _do_initial_training method in deepaugment.py:

    self.notebook.record(
        -1, ["first", 0.0, "first", 0.0, "first", 0.0, 0.0], 1, None, history
    )

The trial_hyperparams list passed to the notebook object has length 7, but the notebook.record method attempts to access elements all the way up to index 19:

new_df["E_aug1_type"] = trial_hyperparams[16]
new_df["E_aug1_magnitude"] = trial_hyperparams[17]
new_df["E_aug2_type"] = trial_hyperparams[18]
new_df["E_aug2_magnitude"] = trial_hyperparams[19]

Hence the list index out of range error. When I change _do_initial_training to the following, the error goes away, since ["first", 0.0] * 10 produces a 20-element list covering indices 0 through 19:

    self.notebook.record(
        -1, ["first", 0.0] * 10, 1, None, history
    )