Open dave-tucker opened 3 months ago
The in-tree models that are used in Kepler can be removed; `cmd/exporter` can use `pkg/kepler-model-db` to download the latest models if none are in the correct path
We have had a discussion about this in the community call and I think we actually can achieve this already by
IIUC, the idea behind adding this model is to support the use case of running Kepler (without estimator or model-server) on a VM without needing access to any external network.
(NOTE: all the points below are based on my limited understanding of models, training and selection. @sunya-ch please correct me if I am wrong :)
To take advantage of model-db, you will need the estimator sidecar (numpy / scikit).
For the rest of the points, I definitely see a small advantage (in terms of performance) in having a model server written in Go to pick the best model and to serve it. However, models are served only once per Kepler, so I am not really sure a rewrite benefits us at this point in time.
Also, the model selection logic should go hand in hand with the training part, i.e. changes in metadata or features should be incorporated in the best-model selection: https://github.com/sustainable-computing-io/kepler-model-server/blob/f6990f3c0afe7320af90e47e9e91819f397b7b32/src/server/model_server.py#L73 . For that reason I think it is best to keep that logic in Python itself.
What would you like to be added?
Currently we have this Python-based project, https://github.com/sustainable-computing-io/kepler-model-server, which does many things.
Some of that belongs in Python, i.e. training or anything that uses `numpy`. However, there are elements of this codebase that would be useful to have in Go form. What I would propose is:
- `pkg/model-server-api` that implements this API
- `pkg/model-server-client` from the OpenAPI spec - this would be used by the `pkg/model` inferencing code.
- `pkg/model-db` - which handles interactions with kepler model db
- `cmd/model-server` the actual binary that serves the API

This would then leave the functionality of the estimator and online-trainer in Python since the model pipelines should not need to change. These can either remain in Python, and the REST API from the model server can be adjusted appropriately.
OR
We can call them from the Go code by embedding CPython. See: https://poweruser.blog/embedding-python-in-go-338c0399f3d5
Why is this needed?
`cmd/exporter` can use `pkg/kepler-model-db` to download the latest models if none are in the correct path