mljar / mljar-supervised

Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation

https://mljar.com

MIT License


Compute predictions from selected model #423

Open pplonski opened 3 years ago

pplonski commented 3 years ago

Details in discussion https://github.com/mljar/mljar-supervised/discussions/421

asuzukosi commented 3 years ago

I'd like to work on this.

pplonski commented 3 years ago

@asuzukosi great!

Some tips:

You can pass the model name to the predict() method. The model name is simply the string that is displayed in the report.
When AutoML loads models from disk, only the models needed by the best_model are loaded (for speed). You need to ensure that the model selected for prediction is loaded. In the case of an Ensemble, that might be several models.

Please let me know if you need more help.

matrixhead commented 3 years ago

Hey, is this issue still open? Can I work on this?

pplonski commented 3 years ago

hey @matrixhead!

the issue is still open, I'm assigning it to you

matrixhead commented 3 years ago

460

I don't know if this is the right way to do it.

pplonski commented 3 years ago

@matrixhead a few comments:

Have you tested the code? Did you write unit tests? I would suggest writing them. Do you have an idea which unit tests could be added to cover corner cases?
I would name the argument model_name.
When models are loaded from the hard drive, only the models listed in the load_on_predict parameter in the params.json file are loaded. Those are the models needed by the best_model. We need to ensure that the model selected by the user is loaded.
There are predict(), predict_proba() and predict_all() methods. They should all support the model_name parameter.
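The load_on_predict point above can be illustrated with a small standalone check. This is only a sketch, not the library's implementation: it assumes that params.json sits directly in the AutoML results folder and that load_on_predict is a list of model names, and the helper name is_loaded_on_predict is hypothetical.

```python
import json
from pathlib import Path


def is_loaded_on_predict(results_path, model_name):
    """Return True if model_name is listed under load_on_predict.

    Assumption: params.json lives in the results folder and stores
    load_on_predict as a list of model-name strings.
    """
    params_file = Path(results_path) / "params.json"
    params = json.loads(params_file.read_text())
    # A missing key is treated as "nothing is loaded on predict".
    return model_name in params.get("load_on_predict", [])
```

If the check returns False for the model requested by the user, that model would have to be loaded explicitly before computing predictions.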

matrixhead commented 3 years ago

No, I didn't write tests; I will try to do that.
OK.
Can I do it this way: first call load() from _predict(), and then check whether the user-specified model is present in _models?
Okay, I will do that.

matrixhead commented 3 years ago

460

Kshitij68 commented 2 years ago

Hi @pplonski, I just opened a PR for this. I'm sure that more work will be required before we can merge it, but early feedback would be very helpful. Thanks!

dermodmaster commented 7 months ago

Quick and dirty workaround until this is officially implemented:

from supervised.model_framework import ModelFramework
model = ModelFramework.load("AUTOML_FOLDER_NAME_HERE", model_subpath="MODEL_FOLDER_NAME_HERE")

For regression results:

y_pred = automl._base_predict(qdqd_X_test, model=model)["prediction"].to_numpy()

For classification results:

y_pred = automl._base_predict(qdqd_X_test, model=model)["label"].to_numpy()

Hopefully I can save someone's time. :D

Reese-Martin commented 3 months ago

Quick and dirty workaround until this is officially implemented:
from supervised.model_framework import ModelFramework
model = ModelFramework.load("AUTOML_FOLDER_NAME_HERE", model_subpath="MODEL_FOLDER_NAME_HERE")
For regression results:
y_pred = automl._base_predict(qdqd_X_test, model=model)["prediction"].to_numpy()
For classification results:
y_pred = automl._base_predict(qdqd_X_test, model=model)["label"].to_numpy()
Hopefully I can save someone's time. :D

I am trying to implement this workaround, but I am getting the error "module 'supervised.automl' has no attribute '_base_predict'". Do I need to load automl in a different way than "from supervised import automl"?
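The error message suggests that the name automl refers to the supervised.automl module here, whereas the workaround calls _base_predict on an object. A plausible reading, stated as an assumption, is that automl in the workaround should be an instance of the AutoML class (for example one created with from supervised import AutoML). The sketch below wraps the workaround in a hypothetical helper, predict_with_selected_model, that fails early with a clearer message when a module is passed by mistake. It relies on _base_predict, a private method that may change between versions.

```python
import types


def predict_with_selected_model(automl, model, X, column="prediction"):
    """Run the workaround with a selected model and return one output column.

    automl: assumed to be an AutoML instance, not the supervised.automl module.
    model:  the object returned by ModelFramework.load(...) in the workaround.
    column: "prediction" for regression, "label" for classification.
    """
    if isinstance(automl, types.ModuleType) or not hasattr(automl, "_base_predict"):
        raise TypeError(
            "expected an AutoML instance with a _base_predict method, "
            f"got {type(automl).__name__}"
        )
    # Same call as in the workaround; _base_predict is private API.
    return automl._base_predict(X, model=model)[column].to_numpy()
```

With an AutoML instance in hand, the regression case from the workaround would read y_pred = predict_with_selected_model(automl, model, qdqd_X_test), and the classification case would pass column="label".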

pplonski commented 3 months ago

Thank you @Reese-Martin, during the summer we will have an intern, and I hope this will be implemented then :)