Open stathius opened 6 months ago
From a cursory look, it appears that these should be network_kwargs, not model_kwargs, and there doesn't seem to be nesting in the source code. These are constructed as part of:
c.network_kwargs.update(
    channel_mult_noise=1,
    resample_filter=[1, 1],
    model_channels=128,
    channel_mult=[1, 2, 2, 2, 2],
    attn_resolutions=[28],
)  # era5-cwb, 448x448
And passed into the training loop via:
training_loop.training_loop(
    dataset, dataset_iter, valid_dataset, valid_dataset_iter, **c
)
And matched against this signature:
def training_loop(
    dataset,
    dataset_iterator,
    validation_dataset,
    validation_dataset_iterator,
    *,
    task,
    run_dir=".",  # Output directory.
    network_kwargs={},  # Options for model and preconditioning.
    ...
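The flow described above can be sketched in isolation. This is only a minimal illustration, not the project's code: `EasyDict` is a small stand-in for an attribute-style config dictionary, the `training_loop` here is a toy function with a similar keyword-only shape, and the `task` value is made up. It assumes `c` behaves like a plain dict when unpacked.

```python
class EasyDict(dict):
    """Hypothetical stand-in for an attribute-access config dict."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


def training_loop(dataset, *, task, run_dir=".", network_kwargs=None):
    """Toy stand-in: returns the options it received so they can be inspected."""
    network_kwargs = dict(network_kwargs or {})
    return {"task": task, "run_dir": run_dir, "network_kwargs": network_kwargs}


# Build the config the same way as in the snippet above (values shortened).
c = EasyDict(task="regression", network_kwargs=EasyDict())
c.network_kwargs.update(model_channels=128, attn_resolutions=[28])

# **c unpacks the top-level keys into keyword arguments, so network_kwargs
# arrives as a flat dict with no extra "model_kwargs" level.
received = training_loop("dataset-placeholder", **c)
print(received["network_kwargs"])
```

If the options are flat at this point, any extra nesting would have to be introduced later, for example when the model metadata is written to or read from a checkpoint.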
I think the problem comes from how the regressor is saved on disk. It might have been solved already but some of the checkpoints I have been working on had this issue.
Version
0.7.0a
On which installation method(s) does this occur?
Docker, Source
Describe the issue
Analysis: The error occurs because model_kwargs is doubly nested when passed to the U-Net constructor:
{'model_kwargs': {'embedding_type': 'zero', 'encoder_type': 'standard', 'decoder_type': 'standard', 'channel_mult_noise': 1, 'resample_filter': [1, 1], 'model_channels': 128, 'channel_mult': [1, 2, 2, 2, 2], 'attn_resolutions': [28], 'dropout': 0.13}}
This could be a problem with how the metadata for the regression model is saved on disk.
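To illustrate the analysis, here is a minimal sketch of why an extra level of nesting breaks construction and how a defensive unwrap could work. `ToyUNet` and `unwrap_model_kwargs` are hypothetical names used only for illustration and are not part of the library. The real constructor may fail in a different way.

```python
class ToyUNet:
    """Hypothetical stand-in for a U-Net that takes flat keyword options."""

    def __init__(self, *, model_channels=128, channel_mult=(1, 2, 2, 2), dropout=0.0):
        self.model_channels = model_channels
        self.channel_mult = list(channel_mult)
        self.dropout = dropout


def unwrap_model_kwargs(kwargs):
    """Strip redundant {'model_kwargs': {...}} wrappers (assumed checkpoint layout)."""
    while set(kwargs) == {"model_kwargs"} and isinstance(kwargs["model_kwargs"], dict):
        kwargs = kwargs["model_kwargs"]
    return dict(kwargs)


# Shortened version of the nested dict from the report.
nested = {
    "model_kwargs": {
        "model_channels": 128,
        "channel_mult": [1, 2, 2, 2, 2],
        "dropout": 0.13,
    }
}

try:
    ToyUNet(**nested)  # the constructor sees one unknown key, "model_kwargs"
except TypeError as err:
    print("nested kwargs rejected:", err)

# Unwrapping first lets the same options reach the constructor as flat kwargs.
net = ToyUNet(**unwrap_model_kwargs(nested))
print(net.model_channels, net.channel_mult, net.dropout)
```

An unwrap like this only treats the symptom at load time. Already-flat dicts pass through unchanged, so it would be safe to apply to checkpoints that do not have the issue.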
Proposed fix:
1) Pass model_kwargs["model_kwargs"] instead of model_kwargs.
2) Investigate and change how the metadata for the regression model is saved.
Minimum reproducible example
Relevant log output
Environment details