icenet-ai / icenet

The icenet library is a pip-installable Python package containing the commands and code you need to produce forecasts
MIT License

`generate_workers` missing on icenet_train #235

Open JimCircadian opened 6 months ago

JimCircadian commented 6 months ago

Description

When running training, the generate_workers parameter is not guaranteed to be present in the dataset configuration, so we should account for its absence.

What I Did

icenet_train -b 4 -e 5 -f 3 -n 1.44 -s mirrored --gpus 6 -nw --lr 25e-5 \
  exp23_south test_south1 42 \
  2>&1 | tee logs/test_south1.log

Traceback (most recent call last):
  File "/home/dn-byrn1/.conda/envs/icenet/bin/icenet_train", line 33, in <module>
    sys.exit(load_entry_point('icenet', 'console_scripts', 'icenet_train')())
  File "/rds/user/dn-byrn1/hpc-work/icenet/icenet/icenet/model/train.py", line 370, in main
    dataset = IceNetDataSet("dataset_config.{}.json".format(args.dataset),
  File "/rds/user/dn-byrn1/hpc-work/icenet/icenet/icenet/data/dataset.py", line 79, in __init__
    self._generate_workers = self._config["generate_workers"]
KeyError: 'generate_workers'

Checking the dataset config confirms that this value is not present. Because this is a tf.data precached dataset, the dataset is instantiated for training without any generation parameters, so a clause that allows generate_workers to be missing is needed around this invocation:

    if len(args.additional) == 0:
        dataset = IceNetDataSet("dataset_config.{}.json".format(args.dataset),
                                batch_size=args.batch_size,
                                shuffling=args.shuffle_train)

JimCircadian commented 6 months ago

Adding a value for generate_workers to dataset_config...json resolved this. The configuration in use is doubtless quite old (probably 0.2.6 or 7), but we should defend against missing parameters. Also, this parameter shouldn't apply in this situation anyway.
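
For illustration, here is a minimal sketch of the kind of defensive lookup described above. It is not the actual icenet code: the helper name read_optional_int and the None default are assumptions made for this example, and only the generate_workers key comes from the traceback.

```python
import logging


def read_optional_int(config, key, default=None):
    """Return config[key] as an int, or `default` if the key is absent.

    Hypothetical helper, not part of icenet: it replaces a bare
    config[key] lookup, which raises KeyError on older configurations.
    """
    if key not in config:
        logging.warning("'%s' missing from dataset config, using %r",
                        key, default)
        return default
    return int(config[key])


# An older-style config without the key, and a newer one that has it.
# Both dictionaries are made up for this example.
legacy_config = {"batch_size": 4}
current_config = {"batch_size": 4, "generate_workers": 8}

legacy_workers = read_optional_int(legacy_config, "generate_workers")
current_workers = read_optional_int(current_config, "generate_workers")
```

Using None as the default keeps "not applicable" distinct from a real worker count, which seems to fit the precached case where no generation takes place. Any code that later uses the value would need to handle None.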