openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
https://docs.openvino.ai
Apache License 2.0

[Bug]: Not able to load TensorFlow model on OpenVino model server #25346

Closed deepanshuhardaha closed 1 month ago

deepanshuhardaha commented 3 months ago

OpenVINO Version

openvino/model_server:2024.2

Operating System

Ubuntu 18.04 (LTS)

Device used for inference

CPU

Framework

Keras (TensorFlow 2)

Model used

https://github.com/tensorflow/docs/blob/master/site/en/tutorials/quickstart/beginner.ipynb

Issue description

🐛 Describe the bug

Before deploying our production model, we attempted to load a basic TensorFlow model on Kubernetes using the OpenVINO model server image. I followed the steps from the official TensorFlow notebook linked above to create a simple model. Afterward, I exported the model with: probability_model.export("<gcs_base_path>/3")
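
For context, a minimal sketch of how that quickstart model can be built and exported (this reconstructs the tutorial; only the `probability_model.export(...)` call is quoted from the report, and the GCS path is a placeholder):

```python
# Sketch based on the TensorFlow beginner quickstart; assumes TF 2.16 / Keras 3.
import tensorflow as tf

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10),
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
model.fit(x_train, y_train, epochs=5)

# Append a softmax layer and export as a TF SavedModel under a numeric version dir.
probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])
probability_model.export("<gcs_base_path>/3")  # placeholder path from the report
```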

I was able to load and infer this exported model using vanilla TensorFlow Serving (used tensorflow/serving:2.16.1 image).

However, when deploying the same exported model on the OpenVINO model server, I encountered the following error:

```
Check 'is_conversion_successful' failed at src/frontends/tensorflow/src/frontend.cpp:478:
FrontEnd API failed with OpConversionFailure:
[TensorFlow Frontend] Internal error, conversion is failed for VarHandleOp operation with a message:
Variable or resource `sequential/dense_1/bias_1` is not initialized, model is inconsistent
```

Environment

Exported model

You can find the exported model here on Google Drive.

Container configuration snippet from deployment YAML

```
containers:
      - name: openVinoServer
        image: openvino/model_server:latest
        args:
        - --model_path=<gcs_base_path>
        - --model_name=dummy-tf
        - --port=8500
        - --target_device=CPU
```

Is this a bug or am I missing something?

Thanks!

Step-by-step reproduction

No response

Relevant log output

No response


deepanshuhardaha commented 3 months ago

I tried the BERT model downloaded from Kaggle on the same setup, and it also fails with the same error.

Iffa-Intel commented 2 months ago

@deepanshuhardaha could you provide:

  1. The steps you took to load a basic TensorFlow model on Kubernetes using the OpenVINO model server image
  2. How you converted your model into OpenVINO format
  3. The commands involved
  4. Did you make any modifications to the model, or is it exactly as in the tutorial you shared?

deepanshuhardaha commented 2 months ago

  1. I used the following Kubernetes deployment YAML to load it

    deployment.yaml

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ovms-deployment
  namespace:
  labels:
    service: ovms-deployment
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 20%
      maxUnavailable: 10%
  selector:
    matchLabels:
      service: ovms-deployment
  template:
    metadata:
      labels:
        service-prometheus-track: ovms-deployment
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8501'
        prometheus.io/path: /metrics
    spec:
      terminationGracePeriodSeconds: 30
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: service
                  operator: In
                  values:
                  - ovms-deployment
              topologyKey: "kubernetes.io/hostname"
            weight: 100
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: cloud.google.com/gke-preemptible
                operator: Exists
            weight: 100
      containers:
      - name: openVinoServer
        image: openvino/model_server:latest
        args:
        - --model_path=
        - --model_name=dummy-tf
        - --port=8500
        - --target_device=CPU
        imagePullPolicy: Always
        ports:
        - containerPort: 8500
          name: grpc
          protocol: TCP
        - containerPort: 8501
          name: http
          protocol: TCP
        env:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value:
        - name: RANDOM_UUID
          value: ${ parameters["random_uuid"] }
        resources:
          limits:
            cpu: 10000m
            memory: 10Gi
          requests:
            cpu: 5000m
            memory: 5Gi
      terminationGracePeriodSeconds: 35
```
  2. I didn't convert the TensorFlow model into OpenVINO format because the OpenVINO documentation states that we can skip model conversion and run inference directly from the TensorFlow source format.
  3. I did not use any commands to convert or host the model on OVMS.
  4. It's exactly as the tutorial I shared. I didn't modify anything.
Iffa-Intel commented 2 months ago

We'll investigate this further and get back to you.

avitial commented 1 month ago

@deepanshuhardaha sorry for the delay. I had no issues running this model (both in TF .pb format and in IR format) on model server OpenVINO 2024.3 (openvino/model_server:2024.3). Please note that although you are not manually converting the model, conversion is still performed automatically under the hood. That said, as you mentioned, you should be able to load the TF model as is without prior manual conversion. :)
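
If you want to rule out the automatic conversion path, a minimal sketch of an explicit offline conversion with the OpenVINO Python API might look like this (the directory names here are placeholders, not taken from the issue):

```python
# Hedged sketch: convert the exported TF SavedModel to OpenVINO IR offline and
# serve the resulting model.xml/model.bin with OVMS instead of the raw SavedModel.
import openvino as ov

ov_model = ov.convert_model("saved_model_dir")          # path to the exported SavedModel
ov.save_model(ov_model, "models/tfmodel/1/model.xml")   # also writes model.bin alongside
```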

Is it possible to try the latest model_server image and see if the issue still occurs? It might also be due to an improper model directory structure; make sure you are following the directory structure rules outlined in Prepare a Model Repository (a typical layout is sketched below). Hope this helps!
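
For reference, OVMS expects each model to live under a numbered version directory inside the model directory. For a TF SavedModel the repository would look roughly like this (the file names shown are the usual SavedModel contents, listed purely as an illustration); a missing variables/ directory would be consistent with the "is not initialized" error above:

```
$ tree models
models
└── tfmodel
    └── 1
        ├── assets
        ├── fingerprint.pb
        ├── saved_model.pb
        └── variables
            ├── variables.data-00000-of-00001
            └── variables.index
```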


$ docker run -d --rm -v ${PWD}/models:/models -p 9000:9000 -p 8000:8000 openvino/model_server:2024.3 --model_path /models/tfmodel/ --model_name tfmodel --port 9000 --rest_port 8000 --log_level DEBUG

$ python predict.py
result: [[3.6116101e-06 1.4753303e-07 1.6656575e-01 7.8238291e-01 1.3858130e-10
  4.9078122e-02 6.5075537e-06 6.5629570e-07 1.9615707e-03 7.1864463e-07]]
# contents of predict.py
import numpy as np
from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")

# generate input data
img = np.random.random((1, 28, 28)).astype(np.float32)

# "keras_tensor_5" is the input name from this particular exported SavedModel signature
output = client.predict({"keras_tensor_5": img}, "tfmodel")
result_index = np.argmax(output[0])  # index of the highest-scoring class
print("result:", output)
avitial commented 1 month ago

Closing this, hope previous responses were sufficient to help you proceed. Feel free to reopen to ask additional questions.