Variable names and weights as part of static dataset files

mllam / neural-lam

Neural Weather Prediction for Limited Area Modeling

MIT License


Variable names and weights as part of static dataset files #3

Closed by joeloskarsson 5 months ago

joeloskarsson commented 11 months ago

Currently the variables in the dataset are listed in constants.py. This ties the code to one specific dataset and makes it hard to use with other datasets.

Proposition

Create a file variables.json in data/my_dataset/static that describes all variables. This includes:

Weather state variables (e.g. u_65)
Forcing variables for the full grid
Batch-static forcing variables (static during one forecast, but changing throughout the dataset; currently this is open water)

All of these should be listed in order, with names. For the weather state variables, their weighting (as currently stored in parameter_weights.npy) should be listed alongside them. We can then remove the lines https://github.com/joeloskarsson/neural-lam/blob/89a4c63370201c9ea1a5f04d4cf1e5e75b7cc83e/create_parameter_weights.py#L26-L31 that generate this weighting file, since the weighting is better set manually when preparing a dataset.
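As a rough illustration of what such a file might contain, here is a minimal sketch that writes a variables.json into a static directory. The key names ("state", "forcing", "batch_static", "name", "weight"), the helper write_variables, the weight values, and every variable name other than u_65 are assumptions made up for this example, not an agreed format.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical layout: key names, weights, and all variable names except
# u_65 are illustrative assumptions, not a format defined by the project.
variables = {
    # Weather state variables, in order, each with a manually chosen weight
    "state": [
        {"name": "u_65", "weight": 1.0},
        {"name": "v_65", "weight": 1.0},
        {"name": "t_2", "weight": 0.5},
    ],
    # Forcing variables defined on the full grid
    "forcing": [{"name": "toa_radiation"}],
    # Static during one forecast, but changing throughout the dataset
    "batch_static": [{"name": "open_water"}],
}


def write_variables(static_dir, description):
    """Write the description to <static_dir>/variables.json and return the path."""
    static_dir = Path(static_dir)
    static_dir.mkdir(parents=True, exist_ok=True)
    path = static_dir / "variables.json"
    path.write_text(json.dumps(description, indent=2))
    return path


# Round trip through a temporary directory standing in for data/my_dataset/static
with tempfile.TemporaryDirectory() as tmp:
    path = write_variables(Path(tmp) / "my_dataset" / "static", variables)
    loaded = json.loads(path.read_text())
```

Using JSON lists rather than an object keyed by variable name keeps the ordering explicit in the file itself, which matters here because the order has to match the feature dimension of the stored data.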

Such a variables.json file could then be loaded into a VariableDescription object and used in the models. The variable dimensions https://github.com/joeloskarsson/neural-lam/blob/89a4c63370201c9ea1a5f04d4cf1e5e75b7cc83e/neural_lam/models/ar_model.py#L22-L24 should then be read from this object rather than hard-coded in a model definition.
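A minimal sketch of what a VariableDescription object could look like is given below. The JSON layout, the field and property names, and all variable names other than u_65 are assumptions for illustration only; the actual class could be designed quite differently.

```python
import json
from dataclasses import dataclass

# Assumed file contents; the layout and all names except u_65 are hypothetical.
EXAMPLE_JSON = """
{
  "state": [
    {"name": "u_65", "weight": 1.0},
    {"name": "v_65", "weight": 1.0},
    {"name": "t_2", "weight": 0.5}
  ],
  "forcing": [{"name": "toa_radiation"}],
  "batch_static": [{"name": "open_water"}]
}
"""


@dataclass(frozen=True)
class VariableDescription:
    """Ordered variable names and state weights for one dataset."""

    state_names: tuple
    state_weights: tuple
    forcing_names: tuple
    batch_static_names: tuple

    @classmethod
    def from_json(cls, text):
        """Build a description from the text of a variables.json file."""
        raw = json.loads(text)
        state = raw["state"]
        return cls(
            state_names=tuple(v["name"] for v in state),
            state_weights=tuple(float(v["weight"]) for v in state),
            forcing_names=tuple(v["name"] for v in raw["forcing"]),
            batch_static_names=tuple(v["name"] for v in raw["batch_static"]),
        )

    # Dimensions a model could read instead of hard-coding them
    @property
    def num_state(self):
        return len(self.state_names)

    @property
    def num_forcing(self):
        return len(self.forcing_names)

    @property
    def num_batch_static(self):
        return len(self.batch_static_names)


desc = VariableDescription.from_json(EXAMPLE_JSON)
```

A model would then take desc.num_state, desc.num_forcing and desc.num_batch_static as its input and output dimensions, and desc.state_weights in place of the values loaded from parameter_weights.npy.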

joeloskarsson commented 5 months ago

Superseded by #23