GeoscienceAustralia / uncover-ml

Machine Learning system for Geoscience Australia uncover project
Apache License 2.0
30 stars 20 forks source link

Pickled config file in model is breaking tests #67

Closed brenmous closed 4 years ago

brenmous commented 4 years ago

The '.model' output of the learn step is a pickle file containing the trained model object and the config object.

If the config file contains environment variables, these will be parsed during the learn step and then provided to the predict step.

This causes an issue as I want to test predict against a precomputed model. When this test is run on another platform (e.g. CircleCi), predict loads the config from the model which contains paths for the platform that the config was parsed on and so files can't be found.

Another issue this may potentially cause is that if a user wants to modify parameters before running predict (e.g. prediction mask), they have to rerun learn and retrain the model unnecessarily so that the config can be repackaged with the model.

zhang01GA commented 4 years ago

Yes. I also discovered that the .model contains detailed config file info when I wrote the metadata profiling module script. The design is problematic and needs to be reviewed.

brenmous commented 4 years ago

I think the best solution is to pass the config file as an argument to predict, and from the output directory parameter (or some new parameter specific for the model) it can locate the trained model.

This cuts down on the chance of a user providing a model but a modified config with different algorithm etc (I assume this was the reason for the original design) but still allows the config to be parsed individually at each step.

zhang01GA commented 4 years ago

Need to figure out the dependencies of .model on config params. The model can be used with certain constraints such as you said algorithm, and what else?

brenmous commented 4 years ago

Looking at the top level script:

brenmous commented 4 years ago

There's possibly other dependencies on the config. My concern is that saving the config in this manner has been done to preserve state - but if so much state needs to be preserved then learn and predict shouldn't be separate steps. We need to work on not only fixing this config issue but also the fundamental problem of predict relying on state that isn't included in the trained model.

brenmous commented 4 years ago

Working on this now - solution is to add a specific model field under the output block. This will specify where it gets saved as part of the learn and cluster commands, and where it will be read from as part of the predict command.

This unlocks the config from the model, allowing prediction-specific changes to be made before predict is rerun. It also allows precomputed models to be passed around - the user just needs to ensure they change that paths in the config (environment variables help with this).