Closed by alexislitvine 2 months ago
@mittagessen - I am trying to train a segmentation model, but I get the following error each time:
Trainable params: 1.3 M
Non-trainable params: 0
Total params: 1.3 M
Total estimated model params size (MB): 5
stage 0/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 179/179 0:06:03 • 0:00:00 0.46it/s val_accuracy: 0.917 early_stopping: 0/10 -inf val_mean_acc: 0.917 val_mean_iu: 0.121 val_freq_iu: 0.37

Error:
/local/filespace/kraken/venv/lib/python3.10/site-packages/kraken/lib/train.py:192 in on_validation_end

    189     """
    190     def on_validation_end(self, trainer: "pl.Trainer", pl_module: "pl.LightningModule")
    191         if not trainer.sanity_checking:
  ❱ 192             trainer.model.nn.hyper_params['completed_epochs'] += 1
    193             metric = float(trainer.logged_metrics['val_metric']) if 'val_metric' in trai
    194             trainer.model.nn.user_metadata['accuracy'].append((trainer.global_step, metr
    195             trainer.model.nn.user_metadata['metrics'].append((trainer.global_step, {k: f

KeyError: 'completed_epochs'
I used:
$ ketos segtrain -f page -N 100 -q early --min-epochs 50 -d cuda:0 -o BL_27042024 -t output.txt --suppress-baselines
I've tagged a new 5.2.3 release with a hotfix. There's another small bug with resizing the output layer when fine-tuning that I'll get to tomorrow.
Works fine - amazing!
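For anyone reading the traceback later: the failing line does `hyper_params['completed_epochs'] += 1`, which raises `KeyError` whenever the key has not been initialised. The snippet below is only a minimal sketch of that failure mode, assuming `hyper_params` behaves like a plain dict. The dict contents and the helper `bump_completed_epochs` are hypothetical, and this is not necessarily what the 5.2.3 hotfix does.

```python
def bump_completed_epochs(hyper_params: dict) -> int:
    """Increment the epoch counter, starting from 0 if the key is missing.

    Hypothetical helper for illustration only; not part of kraken.
    """
    hyper_params["completed_epochs"] = hyper_params.get("completed_epochs", 0) + 1
    return hyper_params["completed_epochs"]


# Hypothetical stand-in for the model's hyper_params mapping.
params = {"epochs": 100}

# Reproduce the error from the traceback: += on a key that does not exist.
try:
    params["completed_epochs"] += 1
except KeyError as exc:
    print(f"KeyError: {exc}")

# The defensive variant initialises the counter on first use.
print(bump_completed_epochs(params))
```

The same effect could be had by setting the key to 0 once when the hyperparameters are created, which keeps the callback itself unchanged.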