GoogleCloudPlatform / cloudml-samples

Cloud ML Engine repo. Please visit the new Vertex AI samples repo at https://github.com/GoogleCloudPlatform/vertex-ai-samples
https://cloud.google.com/ai-platform/docs/
Apache License 2.0

Scikit-Learn Custom Code Sample is Broken #419

Closed nnegrey closed 5 years ago

nnegrey commented 5 years ago

Describe the bug: In "Deploy your custom prediction routine", the step that creates the model version fails.

What sample is this bug related to? https://github.com/GoogleCloudPlatform/cloudml-samples/blob/master/notebooks/scikit-learn/custom-prediction-routine-scikit-learn.ipynb

Source code / logs: Running inside Colab: ERROR: (gcloud.beta.ai-platform.versions.create) Create Version failed. Bad model detected with error: "Failed to load model: Unexpected error when loading the model: 0 (Error code: 0)"

System Information

alecglassford commented 5 years ago

I was able to reproduce this in Colab and I think I found the issue:

Colab has (and the library installation step doesn't override) the latest version of scikit-learn installed (0.21.1). But AI Platform runtime version 1.13 has scikit-learn 0.20.2, which vendors a different version of joblib. I think the problem is that scikit-learn 0.20.2's joblib (on the prediction node) tries to load a model exported by scikit-learn 0.21.1's joblib (in Colab).

Replacing the following installation line:

! pip install numpy scikit-learn

with

! pip install numpy scikit-learn==0.20.2

(and then restarting the runtime) works as a fix.
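
To catch this kind of mismatch before exporting the model, a notebook cell could compare the installed scikit-learn version against the one on the prediction node. The following is only a rough sketch: the helper names are hypothetical, the expected version 0.20.2 is taken from the comment above for runtime version 1.13, and `importlib.metadata` assumes Python 3.8 or newer (on older interpreters, `sklearn.__version__` gives the same information).

```python
from importlib import metadata

# Assumption: scikit-learn version on AI Platform runtime 1.13, as reported above.
EXPECTED_SKLEARN_VERSION = "0.20.2"


def versions_match(installed, expected=EXPECTED_SKLEARN_VERSION):
    """Return True when the installed version string equals the expected one."""
    return installed.strip() == expected.strip()


def installed_sklearn_version():
    """Return the installed scikit-learn version, or None if it is not installed."""
    try:
        return metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Raise RuntimeError if the local scikit-learn differs from the expected one."""
    installed = installed_sklearn_version()
    if installed is None:
        raise RuntimeError("scikit-learn is not installed")
    if not versions_match(installed):
        raise RuntimeError(
            "scikit-learn {} is installed, but {} is expected; "
            "pin the version and restart the runtime".format(
                installed, EXPECTED_SKLEARN_VERSION
            )
        )
```

Calling `check_environment()` right after the installation cell would fail fast in the notebook instead of at the version-creation step.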