openclimatefix / PVNet

PVNet main repo
MIT License

Not all model config values logged accurately in Weights & Biases UI #168

Closed Sukh-P closed 4 weeks ago

Sukh-P commented 3 months ago

Describe the bug

Currently, when W&B is used as the logger while training a model with PVNet, not all config parameters show accurately in the W&B UI; some show as null even when they are non-null.

See example screenshot: Screenshot 2024-03-26 at 10 42 39 (config values showing as null in the W&B UI).

To Reproduce

Train a model with PVNet using W&B as the logger, look at the config pane on the run's overview page, and compare it against the config that was actually used (which should be saved to a local file when running PVNet).

Expected behaviour

Would expect W&B to log all of the model config accurately so that comparisons can be made between runs with different configs. (Note that a local config parameter file is currently produced, so this information is not lost, but having it in the UI would be preferable.)

Additional context

This only seems to happen for sections of the config that use a format like the one in the multimodal.yaml file, where _target_ is the first key on the second level (see the sketch below). Most likely it is an issue with how config in this specific format is parsed for W&B.
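For illustration, the kind of section being described looks roughly like this when built as an OmegaConf config (a minimal sketch; the key names and values are made up, not copied from multimodal.yaml):

from omegaconf import OmegaConf

# Hypothetical sketch: a section whose second-level mapping starts with a
# _target_ key, which is the shape that seems to show up as null in W&B.
cfg = OmegaConf.create(
    {
        "optimizer": {
            "_target_": "torch.optim.Adam",  # _target_ is the first second-level key
            "lr": 0.001,
        }
    }
)
print(OmegaConf.to_yaml(cfg))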

peterdudfield commented 4 weeks ago

Good to note it's not a wandb issue, as this works:

import wandb

wandb_logger = dict(_target_="lightning.pytorch.loggers.wandb.WandbLogger",
                    project="wandb_test",
                    name="test_mode",
                    save_dir="wandb_test",
                    offline=False,  # set True to store all logs only locally
                    log_model=False,
                    job_type="train",
                    optimizer=dict(_target_="torch.optim.Adam", lr=0.1),
                    )

experiment = wandb.init(config=wandb_logger)
experiment.log({"loss": 0.1})

It could be a PyTorch Lightning issue, or a Hydra issue.

peterdudfield commented 4 weeks ago

It also doesn't seem to be a problem with PyTorch Lightning:

from lightning.pytorch.loggers import WandbLogger

logger = WandbLogger(
    project="wandb_test",
    name="test_mode",
    save_dir="wandb_test",
    offline=False,  # set True to store all logs only locally
    log_model=False,
    job_type="train",
)
logger.log_hyperparams({"optimizer": {"_target_": "torch.optim.Adam", "lr": 0.1}})
logger.experiment.log({"loss": 0.1})

This works, so my guess is it's a Hydra issue.
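If it is the Hydra/OmegaConf layer, one thing worth checking is whether converting the config to plain Python containers before it reaches the logger makes the nested _target_ sections show up correctly. A minimal sketch, with made-up config contents:

from omegaconf import OmegaConf
from lightning.pytorch.loggers import WandbLogger

# Hypothetical check: convert the OmegaConf config to plain dicts before
# logging, and see whether the nested _target_ section still shows as null
# in the W&B UI. The config contents here are made up.
cfg = OmegaConf.create({"optimizer": {"_target_": "torch.optim.Adam", "lr": 0.1}})

logger = WandbLogger(project="wandb_test", name="container_test", offline=True)
logger.log_hyperparams(OmegaConf.to_container(cfg, resolve=True))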

peterdudfield commented 4 weeks ago

It looks like it happens in

OmegaConf.save(config.model, f"{callback.dirpath}/model_config.yaml")

in pvnet/training.py

Edit: I'm not sure it is in here now.
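A quick way to rule this line in or out would be to round-trip a config with a nested _target_ section through OmegaConf.save and check that nothing is dropped. A minimal sketch with made-up contents:

from omegaconf import OmegaConf

# Hypothetical round-trip check: if save/load preserves the nested _target_
# section, the nulls in W&B are probably introduced somewhere else.
cfg = OmegaConf.create(
    {"model": {"optimizer": {"_target_": "torch.optim.Adam", "lr": 0.1}}}
)
OmegaConf.save(cfg.model, "model_config.yaml")
print(OmegaConf.to_yaml(OmegaConf.load("model_config.yaml")))  # expect _target_ and lr intact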

peterdudfield commented 4 weeks ago

The file seems to appear once the trainer.fit phase has begun

peterdudfield commented 4 weeks ago

What's also weird is that this config seems to save just the model configs, but not the other configs.

peterdudfield commented 4 weeks ago

I'll close this now, as PR #217 should fix this.