Three minor bugfixes for data_config.yaml workflow

mllam / neural-lam

Neural Weather Prediction for Limited Area Modeling

MIT License

64 stars 24 forks source link

Three minor bugfixes for data_config.yaml workflow #40

Closed sadamov closed 4 weeks ago

sadamov commented 1 month ago

Summary

https://github.com/mllam/neural-lam/pull/31 introduced three minor bugs that are fixed with this PR:

coords_projection is a cached_property and as such is not callable (removed ())
r"" strings are not required in units of data_config.yaml
dictionaries cannot be passed as argsparse, rather JSON strings. This bug is related to the flag var_leads_metrics_watch

leifdenby commented 1 month ago

While you are doing this I thought of something else with the config: Shouldn't num_forcing_features be derived as the length of the dataset.var_names list in the config https://github.com/mllam/neural-lam/blob/main/neural_lam/data_config.yaml#L57? I mean rather than being defined separately.

sadamov commented 1 month ago

Hmm, the dataset.var_names are of the state not of the forcings though. In the future when our yaml file covers state,forcings and boundaries I suggest something along these lines https://github.com/mllam/neural-lam/blob/feature_dataset_yaml/neural_lam/data_config.yaml

      self.grid_dim = (
            2 * self.config_loader.num_data_vars("state")
            + grid_static_dim
            + self.config_loader.num_data_vars("forcing")
            * self.config_loader.forcing.window
        )

Note, that currently the forcings are calculate in parts in the dataset and in otherparts loaded from .npy files.

leifdenby commented 1 month ago

Hmm, the dataset.var_names are of the state not of the forcings though.

Ah yes, of course you're right! But but isn't num_forcing_features == 16 too large then? I thought the forcing features were only radiation etc.

Also, I really like your suggestion about introducing a num_data_vars representation somewhere. Should I start a separate issue for this so we can discuss it in more detail?

sadamov commented 1 month ago

the forcings should be GRID_FORCING_DIM = 5 * 3 + 1 # 5 feat. for 3 time-step window + 1 batch-static (from original constants.py)

=16

land sea mask (batch-static)
top of atmosphere radiation (window)
sin hour of day (window)
cos hour of day (window)
sin day in year (window)
cos day in year (window)

sadamov commented 1 month ago

num_data_vars is already implemented in the Config class: https://github.com/mllam/neural-lam/blob/5b71be3c68d815e0e376ee651c14f09d801f86de/neural_lam/config.py#L51

joeloskarsson commented 1 month ago

The coords_projection problem was fixed by merging #17, so merged main in here so we don't see that change.

joeloskarsson commented 4 weeks ago

Did not add explicitly to changelog as these are only fixes to #23