SeldonIO / seldon-core

An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
https://www.seldon.io/tech/products/core/

Pre-trained models like universal-sentence-encoder from tensorflow hub having issue while serving #4400

Closed ashwini-tgam closed 2 years ago

ashwini-tgam commented 2 years ago

We have many pre-trained models from TF Hub that need nothing beyond being served directly through a serving layer. To try this, I used Seldon Core to serve one of them and called the API with:

curl --location --request POST 'https://{serving_url}/seldon/seldon/embedding/api/v1.0/predictions' \
--header 'Content-Type: application/json' \
--data-raw '{
    "data": {
        "ndarray": [
            "testing service"
        ]
    }
}'

However, it errors out with:

{"status":{"code":-1,"info":"HTTPConnectionPool(host='0.0.0.0', port=2001): Max retries exceeded with url: /v1/models/embedding:predict (Caused by NewConnectionError('\u003curllib3.connection.HTTPConnection object at 0x7f69aeff2b10\u003e: Failed to establish a new connection: [Errno 111] Connection refused'))","reason":"MICROSERVICE_INTERNAL_ERROR","status":1}}

To reproduce

I tried multiple approaches; the details are below.

Option 1: I downloaded the model from TF Hub and served it with the YAML below:

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: embedding
  namespace: seldon
spec:
  name: embedding
  predictors:
  - graph:
      serviceAccountName: poc-seldon-sa
      name: embedding
      type: MODEL
      implementation: TENSORFLOW_SERVER
      modelUri: gs://tf_models_test/embedding/dan/4
      parameters:
      - name: model_input
        type: STRING
        value: text
      - name: model_output
        type: STRING
        value: embedding
      endpoint:
        type: REST
    name: embedding
    replicas: 1
    labels:
      nodepool: general
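For reference, the TENSORFLOW_SERVER implementation proxies requests to TF Serving, whose native REST predict API wraps raw strings in an "instances" list rather than the Seldon ndarray payload. A sketch of that call shape, assuming direct access to the TF Serving container on the port from the error above:

# tf_serving_check.py -- call TF Serving's native REST predict route (illustrative)
import requests

# USE/4 takes raw strings; TF Serving's REST API expects them under "instances".
# localhost:2001 assumes the serving container's port is forwarded locally.
payload = {"instances": ["testing service"]}
resp = requests.post("http://localhost:2001/v1/models/embedding:predict", json=payload)
print(resp.json())  # {"predictions": [[...512 floats...]]} once the model is loaded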
Option 2: I served it with a custom model, Embedding.py:
    
from seldon_core.user_model import SeldonResponse
import tensorflow as tf
import tensorflow_hub as hub
import logging

DAN_MODEL_URI = "https://tfhub.dev/google/universal-sentence-encoder/4"


class Embedding(object):
    """
    Model template. You can load your model parameters in __init__
    from a location accessible at runtime.
    """

    def __init__(self):
        """
        Add any initialization parameters. These will be passed at runtime
        from the graph definition parameters defined in your SeldonDeployment
        Kubernetes resource manifest.
        """
        self._model = hub.load(DAN_MODEL_URI)

    def predict(self, X, features_names=None, meta={}):
        logging.info(f"model meta: {meta}")
        embedding = self._model([X]).numpy().tolist()[0]
        return SeldonResponse(data=embedding)

    def init_metadata(self):
        meta = {
            "name": "embedding",
            "versions": ["dan4"],
            "platform": "seldon",
            "inputs": [
                {
                    "messagetype": "text",
                }
            ],
            "outputs": [{"messagetype": "tensor", "schema": {"shape": [512]}}],
            "custom": {"author": "sophi-dev"},
        }
        return meta
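To rule out the model code itself, the class can be exercised directly outside the Seldon wrapper; a minimal sketch, reusing the Embedding.py above and the same test sentence as the curl request (the file name smoke_test.py is hypothetical):

# smoke_test.py -- run the custom model locally, outside seldon-core-microservice
from Embedding import Embedding

model = Embedding()                      # loads USE/4 from the TF Hub cache
response = model.predict("testing service")
print(len(response.data))                # expected: 512, matching init_metadata()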

requirements.txt:

seldon_core==1.14.1
numpy==1.23.4
tensorflow==2.8.3
tensorflow-hub==0.12.0


Dockerfile:

FROM python:3.9-slim

ARG TF_CACHE_DIR="/var/tmp/tfhub_modules"

WORKDIR /app

# Install python packages
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt

RUN apt-get -y update && \
    apt-get install -y screen && \
    apt-get install -y curl && \
    apt-get install -y wget && \
    apt-get install -y tar

# Copy source code
COPY . .

RUN mkdir -p /app${TF_CACHE_DIR} && \
    chmod -R 777 /app

# Define environment variables
ENV MODEL_NAME=Embedding
ENV SERVICE_TYPE=MODEL
ENV CUDA_VISIBLE_DEVICES=-1
ENV TFHUB_CACHE_DIR=${TF_CACHE_DIR}

# Pre-download the TF Hub model into the image, then verify the cache
RUN python -c "import tensorflow as tf; import tensorflow_hub as hub; hub.load('https://tfhub.dev/google/universal-sentence-encoder/4')"
RUN ls -l $TFHUB_CACHE_DIR

# Port for GRPC
EXPOSE 5001

# Port for REST
EXPOSE 8000

# Change folder ownership to the default user
RUN chown -R 8888 /app

CMD exec seldon-core-microservice $MODEL_NAME --service-type $SERVICE_TYPE
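Before deploying, the image can be sanity-checked by calling the wrapper's REST endpoint directly; a minimal sketch, assuming the container is running locally with the REST port from the EXPOSE above published to the host (note the Seldon Python wrapper commonly defaults to port 9000, so the mapping may need adjusting):

# local_test.py -- probe the microservice REST endpoint (hypothetical helper)
import requests

# Assumes the container was started with its REST port published locally,
# e.g. docker run -p 8000:8000 <image>; adjust if the wrapper listens on 9000.
payload = {"data": {"ndarray": ["testing service"]}}
resp = requests.post("http://localhost:8000/api/v1.0/predictions", json=payload)
print(resp.status_code, resp.json())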


embedding_yaml:

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: embedding
  namespace: seldon
spec:
  name: embedding
  predictors:

Environment

Seldon 1.14.1

Cloud Provider: AWS
Kubernetes Cluster Version: v1.21.1

ashwini-tgam commented 2 years ago

Issue with EKS