Inference step may not enable general "inference mode" usage

raehik commented 10 months ago

This project is architected in three "steps":

1. the data step, which generates forcings from the CM2.6 dataset to train on
2. the training step, which trains a PyTorch neural net using the generated forcings
3. the inference step (recently renamed from the testing step, #92), which runs some unseen data through a trained neural net

Though the inference step loads the model in "inference mode" (net.eval()), it's fairly inflexible: the PyTorch dataloaders are populated using parameters from the training step run (retrieved via MLflow). As a result, it's not currently possible to load arbitrary (CM2.6) data and run it through the model to obtain a prediction.
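
For comparison, here is a rough sketch of what a more general "inference mode" could look like. This is not the project's current code: `TinyNet`, `predict` and the tensor shapes are placeholders made up for illustration, and a real run would load a trained net and a CM2.6 subset instead of random numbers.

```python
import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for a trained net (2 channels in, 2 channels out)."""

    def __init__(self) -> None:
        super().__init__()
        self.conv = nn.Conv2d(2, 2, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.conv(x)


def predict(net: nn.Module, velocities: torch.Tensor) -> torch.Tensor:
    """Run arbitrary input through the net, independent of any training run."""
    net.eval()  # evaluation behaviour for layers such as dropout and batch norm
    with torch.no_grad():  # gradients are not needed for prediction
        return net(velocities)


# Placeholder batch laid out as (time, channel, lat, lon); assumed layout only.
velocities = torch.randn(4, 2, 16, 16)
forcings = predict(TinyNet(), velocities)
print(forcings.shape)
```

The point of the sketch is that `predict` only needs a net and an input tensor, with no dependency on parameters recorded by the training step.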

The original spec for this project may be expecting the inference step to be more general. (Renaming the testing step may have caused misunderstanding.) We should clarify what "inference mode" features we want to provide, and how we might adapt the inference step for that.

raehik commented 10 months ago

Discussed with @dorchard today (2023-10-17).

raehik commented 10 months ago

If we were able to run an arbitrary pretrained net on a user-specified subset of CM2.6, would that match up to the specification/requirements? I'm a little unclear what the output would be. The predicted forcings in each cell, for each time point? There should be some relevant visualization code somewhere. Might be in a Jupyter notebook.

raehik commented 10 months ago

A quick note: the NN inputs are ocean velocities, the outputs are predicted forcing terms.

raehik commented 9 months ago

I think the core issue here is the naming of the inference step. It should be renamed back to "testing", and a separate "inference" script introduced which does a little less and allows loading pretrained models. That's tackled in #97.
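
To make the idea of a separate script concrete, its command-line interface might look something like the sketch below. The flag names (`--model-path`, `--input-data`, `--output`) and the default output filename are hypothetical, chosen for illustration only, and are not taken from #97 or from the existing code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a hypothetical standalone inference script."""
    parser = argparse.ArgumentParser(
        description="Run a pretrained net on user-specified data (sketch)."
    )
    # All flag names below are assumptions, not the project's real interface.
    parser.add_argument("--model-path", required=True,
                        help="path to a pretrained model file")
    parser.add_argument("--input-data", required=True,
                        help="path to the input velocities (e.g. a CM2.6 subset)")
    parser.add_argument("--output", default="forcings.nc",
                        help="where to write the predicted forcings")
    return parser


# Example invocation with made-up paths.
args = build_parser().parse_args(
    ["--model-path", "net.pth", "--input-data", "subset.zarr"]
)
print(args.model_path, args.input_data, args.output)
```

Nothing here refers to MLflow or to a training run, which is the "does a little less" part: the script would only need a model file and some input data.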

m2lines / gz21_ocean_momentum

Inference step may not enable general "inference mode" usage #98