mljar / mljar-supervised

Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
https://mljar.com
MIT License
3k stars · 401 forks

Export model and import to Keras Tensorflow #113

Closed wulftone closed 3 years ago

wulftone commented 4 years ago

I have an MLJar model I'm trying to run in a Flask app in SageMaker, but it wants to run on multiple threads and is having issues:

  File "/usr/local/lib/python3.8/site-packages/keras/backend/tensorflow_backend.py", line 73, in symbolic_fn_wrapper
    if _SYMBOLIC_SCOPE.value:
  File "src/gevent/local.py", line 408, in gevent._gevent_clocal.local.__getattribute__

I'm running git+https://github.com/mljar/mljar-supervised.git@44e12edb0598eecbf8dc9b587752b94bec5b6359 because I needed the Neural Network model.

See this issue and the many replies. I don't want to disable the multiple threads because I expect a heavy load on this endpoint. One of the solutions that worked for someone was this:

Same problem when loading multiple Keras models via Flask. To solve the problem instead of using:

from keras.models import model_from_json

I used this:

from tensorflow.keras.models import model_from_json

In the future instead of installing keras I will use tensorflow.keras.

I was hoping I could do the same but I'm not sure how to get the JSON out for my ensemble to re-import it using Tensorflow Keras somehow...? Sorry if this is obvious to everyone, I'm new to Keras/TF.
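The workaround quoted above can be sketched as a small loader helper. This is a minimal sketch, not mljar-supervised's actual code: the `json_path`/`weights_path` arguments and the `load_model` helper name are hypothetical, and the import is guarded so the sketch is readable even where TensorFlow is not installed.

```python
# Sketch: rebuild a Keras model from its JSON architecture using the
# tensorflow.keras namespace instead of the standalone keras package,
# which is the swap that reportedly avoids the gevent/_SYMBOLIC_SCOPE issue.
try:
    from tensorflow.keras.models import model_from_json
except ImportError:  # TensorFlow not installed in this environment
    model_from_json = None


def load_model(json_path, weights_path):
    """Rebuild a model from an architecture JSON file, then load its weights.

    Both paths are hypothetical; adapt them to wherever your model
    artifacts actually live.
    """
    if model_from_json is None:
        raise RuntimeError("TensorFlow is required to load the model")
    with open(json_path) as f:
        model = model_from_json(f.read())
    model.load_weights(weights_path)
    return model
```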

pplonski commented 4 years ago

Hi Trevor,

Never seen such an error. Maybe the most straightforward way would be to change the NN code in mljar-supervised in this line. I hope tensorflow.keras.models.model_from_json is compatible with keras.models.model_from_json (it should be!). Please let me know how it works; maybe I will change this line in the master branch.

BTW, I understand that you are building your own Flask application and are going to deploy it on AWS? Some time ago, I wrote a tutorial on how to deploy ML models with Django https://www.deploymachinelearning.com/ (tutorial repo) - maybe you will find it helpful. For long-running predictions (computations), it would be good to add Celery or Python-RQ.

Please let me know if you fix your problem.

There is an open issue about making it possible to deploy the ML model as a REST API (#54). May I ask what you would like to have in such a REST API? Only an endpoint to access the model, or more features (for example, number of requests, request input data, an endpoint for feedback)?

wulftone commented 4 years ago

I've forked this repo and made some changes to get it working, thanks! I've made a PR: https://github.com/mljar/mljar-supervised/pull/115.

Additionally, I discovered that the running process must be started in the same directory as the AutoML_# folders, or a model won't load again. I'm doing

AutoML(results_path=model_folder)

and it works if I put my AutoML_0 folder in the same folder as my Flask app's main file, but not if AutoML_0 is in another folder somewhere else. Parts of the code do os.path.join(self._results_path, "params.json"), which produces a full path, but others do os.path.join(model_path, "framework.json"), which produces only a relative path. I'm not sure how much deeper this assumption goes than what I've found, but it might be good to build a full path everywhere, relative to _results_path. Since I'm using SageMaker, it likes to put models in /opt/ml/model, which has to be a different directory than the one the Python app runs in. It would be great to support this somehow; maybe I'm overlooking something...
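The fix suggested above could look something like the sketch below. This is not mljar-supervised's implementation; the `framework_file` helper and its argument names are illustrative, the point being that every model file is resolved relative to the results path rather than the process's working directory.

```python
import os


def framework_file(results_path, model_name):
    """Return an absolute path to a model's framework.json.

    By anchoring the join at `results_path` and calling abspath, the
    result no longer depends on which directory the Python process
    (e.g. a Flask app on SageMaker) was started from.
    """
    model_path = os.path.join(results_path, model_name)
    return os.path.abspath(os.path.join(model_path, "framework.json"))
```

With this pattern, `AutoML(results_path="/opt/ml/model/AutoML_0")` would work even though `/opt/ml/model` is not the app's working directory.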

pplonski commented 3 years ago

I've moved the NN implementation from Keras+TensorFlow to scikit-learn's MLPClassifier and MLPRegressor. There is no performance decrease after the change, and NN training can even be faster. You can read the details of the TF vs sklearn comparison at https://mljar.com/blog/tensorflow-vs-scikit-learn/
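For anyone curious what the sklearn-based replacement looks like in isolation, here is a minimal sketch. The hyper-parameters (hidden layer size, max_iter) are illustrative, not mljar-supervised's defaults.

```python
# Sketch: a scikit-learn MLP in place of a Keras network.
# Being pure Python/NumPy, it sidesteps the TF graph/thread-local
# state that caused the Flask + gevent error in this thread.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic data just to make the example self-contained.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
clf.fit(X, y)
print("train accuracy:", round(clf.score(X, y), 2))
```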