amanteur / SCNet-PyTorch

Unofficial PyTorch implementation of "SCNet: Sparse Compression Network for Music Source Separation"
MIT License
42 stars, 1 fork

Could not find the monitored key in the returned metrics #2

Closed ari-ruokamo closed 5 months ago

ari-ruokamo commented 5 months ago

Beginner ML question here: on my training run, the metrics/validation phase seems to fail at the end of the epoch. I've reduced the training sources to "vocals" and "other".

File "/home/arijr/miniconda3/envs/scnet/lib/python3.10/site-packages/lightning/pytorch/callbacks/model_checkpoint.py", line 382, in _save_topk_checkpoint
    raise MisconfigurationException(m)
lightning.fabric.utilities.exceptions.MisconfigurationException: ModelCheckpoint(monitor='val/usdr') could not find the monitored key in the returned metrics: ['lr-Adam', 'train/loss', 'grad_2.0_norm_total', 'val/loss', 'usdr_other', 'usdr_vocals', 'usdr', 'epoch', 'step']. HINT: Did you call log('val/usdr', value) in the LightningModule?

amanteur commented 5 months ago

Hey!

Thank you for the issue!

It was actually a bug: the 'val/' prefix was not being added to the logged uSDR metrics (they were logged as 'usdr', 'usdr_other', and 'usdr_vocals'), so at the end of the epoch ModelCheckpoint could not find 'val/usdr'. I fixed that, and now it should work!
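
For anyone hitting the same error, here is a minimal sketch of the mismatch in plain Python. It is not the actual patch from this repo. The helper name `prefix_metrics` and the metric values are made up for illustration. Only the key names come from the error message above.

```python
def prefix_metrics(metrics: dict[str, float], stage: str) -> dict[str, float]:
    """Return a copy of `metrics` with every key namespaced as '<stage>/<key>'.

    Hypothetical helper, not part of Lightning or of this repository.
    """
    return {f"{stage}/{name}": value for name, value in metrics.items()}


# Key names as they appear in the error message; the values are invented.
raw = {"usdr_other": 1.0, "usdr_vocals": 2.0, "usdr": 1.5}

# The key that ModelCheckpoint(monitor='val/usdr') looks for.
monitor = "val/usdr"

# Before the fix: the metric exists, but under a different name.
print(monitor in raw)  # False

# After namespacing the validation metrics, the monitored key is present.
fixed = prefix_metrics(raw, "val")
print(monitor in fixed)  # True
```

In a LightningModule, a dictionary like `fixed` would typically be passed to `self.log_dict(...)` in the validation step, so the logged names match the `monitor` argument. The other way around should also work: keep the unprefixed names and set `monitor='usdr'` in the checkpoint config.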