triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Stub process is not healthy. #7186

Closed NDNM1408 closed 6 months ago

NDNM1408 commented 6 months ago

I want to use the Python backend with Triton to deploy a TTS model using HiFi-GAN and FastPitch. When I run inference on the hifigan model, I get this error:

tritonclient.utils.InferenceServerException: [400] Failed to process the request(s) for model instance 'hifigan_0', message: Stub process is not healthy.
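For context, I call the model from the client roughly like this (a minimal sketch; the URL, model name, and spectrogram shape are placeholders for my actual setup):

import numpy as np
import tritonclient.http as httpclient

# Rough sketch of the client-side call; localhost:8000, the model name,
# and the (1, 80, 256) spectrogram shape are placeholders.
client = httpclient.InferenceServerClient(url="localhost:8000")

spec = np.random.rand(1, 80, 256).astype(np.float32)
inp = httpclient.InferInput("spec", list(spec.shape), "FP32")
inp.set_data_from_numpy(spec)

result = client.infer(model_name="hifigan", inputs=[inp])
audio = result.as_numpy("audio")
print(audio.shape)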

This is the content of the model.py file:

import json

import torch
import triton_python_backend_utils as pb_utils
from nemo.collections.tts.models import HifiGanModel


class TritonPythonModel:

    def initialize(self, args):
        self.model_config = model_config = json.loads(args['model_config'])

        output_config = pb_utils.get_output_config_by_name(
            model_config, "audio")

        # Convert Triton types to numpy types
        self.output_dtype = pb_utils.triton_string_to_numpy(
            output_config['data_type'])

        # Load the HiFi-GAN vocoder checkpoint
        self.model = HifiGanModel.restore_from("hifigan.nemo")

    def execute(self, requests):
        output_dtype = self.output_dtype

        responses = []

        # Every Python backend must iterate over every one of the requests
        # and create a pb_utils.InferenceResponse for each of them.
        for request in requests:
            spec = pb_utils.get_input_tensor_by_name(request, "spec").as_numpy()
            audio = self.model.convert_spectrogram_to_audio(spec=torch.from_numpy(spec))
            print(audio)
            print(audio.shape)
            # convert_spectrogram_to_audio returns a torch tensor, so move it
            # to the CPU and convert it to numpy before building the output tensor
            audio = audio.detach().cpu().numpy()
            out_tensor = pb_utils.Tensor("audio",
                                         audio.astype(output_dtype))
            inference_response = pb_utils.InferenceResponse(
                output_tensors=[out_tensor])
            responses.append(inference_response)

        # You should return a list of pb_utils.InferenceResponse. Length
        # of this list must match the length of `requests` list.
        return responses

    def finalize(self):
        """`finalize` is called only once when the model is being unloaded.
        Implementing `finalize` function is OPTIONAL. This function allows
        the model to perform any necessary clean ups before exit.
        """
        print('Cleaning up...')
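For reference, this is roughly how the model can be exercised outside Triton (a quick sanity-check sketch; the checkpoint path and spectrogram shape are placeholders):

import torch
from nemo.collections.tts.models import HifiGanModel

# Standalone sanity check outside Triton; "hifigan.nemo" and the
# (1, 80, 256) spectrogram shape are placeholder values.
model = HifiGanModel.restore_from("hifigan.nemo")
spec = torch.rand(1, 80, 256)
with torch.no_grad():
    audio = model.convert_spectrogram_to_audio(spec=spec)
print(audio.shape)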

Can anyone help?

krishung5 commented 6 months ago

Hi @NDNM1408, which Triton version are you using? Could you also provide the model config and any steps required for us to reproduce the issue?

victorsoda commented 6 months ago

Hi @krishung5, I ran into the same problem, "Stub process 'add_sub_0_0' is not healthy.", while trying to build triton_python_backend_stub for Python 3.8. The error looked like this:

I0509 08:28:58.312840 1635 grpc_server.cc:2466] Started GRPCInferenceService at 0.0.0.0:8001
I0509 08:28:58.313128 1635 http_server.cc:4636] Started HTTPService at 0.0.0.0:8000
I0509 08:28:58.354728 1635 http_server.cc:320] Started Metrics Service at 0.0.0.0:8002
terminate called after throwing an instance of 'boost::interprocess::lock_exception'
  what():  boost::interprocess::lock_exception
Signal (6) received.
 0# 0x00005648F7FA23BD in tritonserver
 1# 0x00007F57FBD1E520 in /lib/x86_64-linux-gnu/libc.so.6
 2# pthread_kill in /lib/x86_64-linux-gnu/libc.so.6
 3# raise in /lib/x86_64-linux-gnu/libc.so.6
 4# abort in /lib/x86_64-linux-gnu/libc.so.6
 5# 0x00007F57FBFA7B9E in /lib/x86_64-linux-gnu/libstdc++.so.6
 6# 0x00007F57FBFB320C in /lib/x86_64-linux-gnu/libstdc++.so.6
 7# 0x00007F57FBFB21E9 in /lib/x86_64-linux-gnu/libstdc++.so.6
 8# __gxx_personality_v0 in /lib/x86_64-linux-gnu/libstdc++.so.6
 9# 0x00007F57FE17F884 in /lib/x86_64-linux-gnu/libgcc_s.so.1
10# _Unwind_RaiseException in /lib/x86_64-linux-gnu/libgcc_s.so.1
11# __cxa_throw in /lib/x86_64-linux-gnu/libstdc++.so.6
12# 0x00007F57E8FD8E1A in /opt/tritonserver/backends/python/libtriton_python.so
13# 0x00007F57E8F94EB0 in /opt/tritonserver/backends/python/libtriton_python.so
14# 0x00007F57E8FA2BBA in /opt/tritonserver/backends/python/libtriton_python.so
15# 0x00007F57E8F8E193 in /opt/tritonserver/backends/python/libtriton_python.so
16# TRITONBACKEND_ModelInstanceExecute in /opt/tritonserver/backends/python/libtriton_python.so
17# 0x00007F57FC71DD74 in /opt/tritonserver/bin/../lib/libtritonserver.so
18# 0x00007F57FC71E0DB in /opt/tritonserver/bin/../lib/libtritonserver.so
19# 0x00007F57FC8329BD in /opt/tritonserver/bin/../lib/libtritonserver.so
20# 0x00007F57FC721D64 in /opt/tritonserver/bin/../lib/libtritonserver.so
21# 0x00007F57FBFE1253 in /lib/x86_64-linux-gnu/libstdc++.so.6
22# 0x00007F57FBD70AC3 in /lib/x86_64-linux-gnu/libc.so.6
23# clone in /lib/x86_64-linux-gnu/libc.so.6

To Reproduce

Below is my process of building triton_python_backend_stub and running the official add_sub example, following the README (https://github.com/triton-inference-server/python_backend?tab=readme-ov-file#building-custom-python-backend-stub):

  1. My Docker image for compilation: qic_ubuntu_1804_gcc7:1.5.0.3, on machine A (which is fast for compilation).
  2. git clone https://github.com/triton-inference-server/python_backend -b main
    cd python_backend
    curl -O https://archives.boost.io/release/1.79.0/source/boost_1_79_0.tar.gz
    sudo apt-get install libarchive-dev
    cd ..
    git clone https://github.com/Tencent/rapidjson.git  # to install rapidjson
    cd rapidjson
    git submodule update --init
    mkdir build && cd build
    cmake ..
    make
    make install
  3. Change all the "std::filesystem" into "std::experimental::filesystem" and "<filesystem>" into "<experimental/filesystem>" under the directory python_backend/src/
  4. cd python_backend
    mkdir build && cd build
    cmake -DTRITON_ENABLE_GPU=OFF -DTRITON_BACKEND_REPO_TAG=main -DTRITON_COMMON_REPO_TAG=main -DTRITON_CORE_REPO_TAG=main -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install -DPYTHON_EXECUTABLE=$(which python3.8) ..   # my python3.8 version is python3.8.12
    make triton-python-backend-stub

    When I executed "ldd triton_python_backend_stub", I saw the output below, as expected per the README:

    linux-vdso.so.1 (0x00007fff66d6c000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fea9ce70000)
        libpython3.8.so.1.0 => /usr/lib/x86_64-linux-gnu/libpython3.8.so.1.0 (0x00007fea9c71b000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fea9c392000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fea9c17a000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fea9bf5b000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fea9bb6a000)
        libexpat.so.1 => /lib/x86_64-linux-gnu/libexpat.so.1 (0x00007fea9b938000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fea9b71b000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fea9b517000)
        libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fea9b314000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fea9af76000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fea9d078000)
  5. Then I started an official triton-server docker on another machine B (which is fast for running model inference):
    docker run -itd --privileged --network host --name victor.chen_triton -e HOME=/home/victor.chen -v /root:/root -v /home:/home -v /data:/data  --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864  --ulimit nofile=65536 nvcr.io/nvidia/tritonserver:24.03-py3
  6. And I installed python3.8 in the server docker:
    add-apt-repository ppa:deadsnakes/ppa
    apt update
    apt install python3.8  # Here I could only install python3.8.19 inside the server docker. 
    apt install python3.8-distutils
    apt install python3.8-dev
    python3.8 -m pip install setuptools
  7. Afterwards, I followed the README (https://github.com/triton-inference-server/python_backend?tab=readme-ov-file#quick-start) to start the server with the official add_sub model:
    cd python_backend
    mkdir -p models/add_sub/1/
    cp examples/add_sub/model.py models/add_sub/1/model.py
    cp examples/add_sub/config.pbtxt models/add_sub/config.pbtxt
    cp triton_python_backend_stub models/add_sub/  # My py3.8 stub is expected to be used by this line.
    tritonserver --model-repository `pwd`/models

    The server was successfully started.

  8. Finally, I started another docker (qic_ubuntu_1804_gcc7:latest) as the client docker, and tried to run client.py following the README:
    python3.8 -m pip install tritonclient[http] opencv-python-headless  (I have python3.8.12)
    python3.8 python_backend/examples/add_sub/client.py

    Then my client.py got an "unhealthy" exception: tritonclient.utils.InferenceServerException: [500] Failed to process the request(s) for model instance 'add_sub_0_0', message: Stub process 'add_sub_0_0' is not healthy. And my triton server crashed with the same boost::interprocess::lock_exception stack trace shown above.

  9. If I remove models/add_sub/triton_python_backend_stub, restart the server (this time it uses the default Python 3.10 in the Triton server docker), and run python3.8 python_backend/examples/add_sub/client.py again, everything works fine:
    root@vir-115-46-001:~/triton# python3.8 python_backend/examples/add_sub/client.py
    INPUT0 ([0.1482776  0.73882675 0.68110615 0.46473113]) + INPUT1 ([0.7229217  0.18495801 0.7215026  0.2987663 ]) = OUTPUT0 ([0.8711993  0.92378473 1.4026088  0.7634974 ])
    INPUT0 ([0.1482776  0.73882675 0.68110615 0.46473113]) - INPUT1 ([0.7229217  0.18495801 0.7215026  0.2987663 ]) = OUTPUT1 ([-0.57464415  0.5538688  -0.04039645  0.16596484])
    PASS: add_sub

    I think step 9 confirms that my triton_python_backend_stub (built in steps 1-4) is the main cause of the problem.

Could you please tell me what I was doing wrong? How can I make my python3.8 stub healthier?

Tabrizian commented 6 months ago

@victorsoda You need to compile the same branch of the repo as the server. I think the issue is that you're using the main branch with the 24.03 version of the server. Could you try building the Python backend from the r24.03 branch and let us know if you're still running into an error?
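If you're not sure which release your container corresponds to, one way to check is to read the version from the server metadata and pick the matching r<version> branch. A minimal sketch using the HTTP client (the localhost URL is an assumption about your setup):

import tritonclient.http as httpclient

# Query the running server for its core version; localhost:8000 is an
# assumed endpoint. The core version maps to a release branch
# (for example, the 24.03 container reports 2.44.0).
client = httpclient.InferenceServerClient(url="localhost:8000")
metadata = client.get_server_metadata()
print(metadata["version"])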

victorsoda commented 6 months ago

@victorsoda You need to compile the same branch of the repo as the server. I think the issue is that you're using the main branch with the 24.03 version of the server. Could you try building the Python backend from the r24.03 branch and let us know if you're still running into an error?

@Tabrizian Thanks a lot for your quick reply! After changing the python_backend repo branch from main to r24.03, the problem is solved. That's awesome!

sboudouk commented 4 months ago

@victorsoda You need to compile the same branch of the repo as the server. I think the issue is that you're using the main branch with the 24.03 version of the server. Could you try building the Python backend from the r24.03 branch and let us know if you're still running into an error?

Just want to be sure here: is manually upgrading python_backend also needed if I'm running the nvcr.io/nvidia/tritonserver:23.07-py3 Docker image?