deploy_trt.sh closes while executing, server couldn't start

aerenkaradag commented 1 year ago

I'm new to Docker. I installed Docker and nvidia-container-toolkit on my computer, but when I run deploy_trt.sh it starts and then exits without printing any error. Setting the log level to DEBUG did not change the output. I also edited the n_gpu and n_workers parameters. What should I do to start the server?

The output looks like this:

~/InsightFace-REST$ sudo ./deploy_trt.sh 
[+] Building 1.1s (13/13) FINISHED                                                docker:default
 => [internal] load build definition from Dockerfile_trt                                    0.0s
 => => transferring dockerfile: 692B                                                        0.0s
 => [internal] load .dockerignore                                                           0.0s
 => => transferring context: 2B                                                             0.0s
 => [internal] load metadata for nvcr.io/nvidia/tensorrt:23.05-py3                          1.0s
 => [1/8] FROM nvcr.io/nvidia/tensorrt:23.05-py3@sha256:2e8649f3caebc0fdeb14a89f36ae62ddda  0.0s
 => [internal] load build context                                                           0.0s
 => => transferring context: 5.55kB                                                         0.0s
 => CACHED [2/8] RUN apt-get update &&    apt-get install -y            libgl1-mesa-glx     0.0s
 => CACHED [3/8] COPY requirements.txt .                                                    0.0s
 => CACHED [4/8] RUN python -m pip --no-cache-dir install --upgrade -r requirements.txt     0.0s
 => CACHED [5/8] RUN python -m pip --no-cache-dir install --upgrade cupy-cuda12x pynvjpeg   0.0s
 => CACHED [6/8] RUN python -m pip --no-cache-dir install --upgrade onnxruntime-gpu         0.0s
 => CACHED [7/8] WORKDIR /app                                                               0.0s
 => CACHED [8/8] COPY api_trt /app                                                          0.0s
 => exporting to image                                                                      0.0s
 => => exporting layers                                                                     0.0s
 => => writing image sha256:adb893cdd33d172d8240d9db7d8756149f501e06e99281d02f76557626f4d9  0.0s
 => => naming to docker.io/library/insightface-rest:v0.8.3.0                                0.0s
Starting 1 workers on 1 GPUs (1 workers per GPU)
Containers port range: 18081 - 18081
insightface-rest-gpu0-trt
--- Starting container insightface-rest-gpu0-trt with "device=0" at port 18081
e0bf8e88e3052ef31ef1a19cd93d5852a26ba27678173a5da3e3d62d47335c14

SthPhoenix commented 1 year ago

It looks like everything completed successfully. You just need to open localhost:18081 after some time, since the container first has to download modules and build the TRT engines.

You can try running docker attach insightface-rest-gpu0-trt to watch what's going on inside the container.
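
For anyone who would rather script the wait than keep refreshing the browser, here is a minimal sketch that polls the URL until the server answers. It uses only the Python standard library. The helper name wait_for_http and the timeout and interval values are made up for illustration and are not part of InsightFace-REST. It treats any HTTP response, including an error status, as "server is up".

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_http(url, timeout=600.0, interval=5.0):
    """Poll `url` until it returns any HTTP response or `timeout` seconds pass.

    Hypothetical helper for illustration; not part of InsightFace-REST.
    Returns True once the server answers, False if the deadline is reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server replied, just with an error status: it is running.
            return True
        except (OSError, http.client.HTTPException):
            # Connection refused, reset, or timed out: not ready yet.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example (port taken from the deploy_trt.sh output above):
# wait_for_http("http://localhost:18081/")
```

An HTTP request is used rather than a bare TCP connect because a published Docker port can accept connections before the application inside the container is listening.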

aerenkaradag commented 1 year ago

Thanks a lot for your answer. When I ran the command you suggested, I was able to see the error logs. The problem was caused by the version of the 'pydantic' library: some modules are only available in old versions and others only in newer versions. So I changed the first lines of InsightFace-REST/src/api_trt/settings.py as follows:

from pydantic.v1.env_settings import BaseSettings
from pydantic.v1.validators import str_validator

Now I can use pydantic-2.0.2 without any problems.
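
In case it helps someone who has to support both old and new pydantic installs, here is a rough sketch of a fallback import. It is only an illustration, not what the repo does: the helper first_importable is a hypothetical name, and the sketch assumes that pydantic.v1.env_settings is importable on pydantic 2.x and pydantic.env_settings on 1.x.

```python
import importlib


def first_importable(candidates):
    """Return the first module in `candidates` that imports successfully.

    Hypothetical helper for illustration; raises ImportError if none import.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of these modules could be imported: %r" % (candidates,))


# Possible use in settings.py (assumption: the v1 shim is tried first, because
# on pydantic 2.x the legacy top-level names no longer provide BaseSettings):
#
# env_settings = first_importable(["pydantic.v1.env_settings", "pydantic.env_settings"])
# validators = first_importable(["pydantic.v1.validators", "pydantic.validators"])
# BaseSettings = env_settings.BaseSettings
# str_validator = validators.str_validator
```

The order of the candidates matters: the pydantic.v1 path goes first so that a 2.x install picks up the compatibility shim instead of the legacy module name.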

SthPhoenix commented 1 year ago

Hm, that's interesting, thanks! I'll update the repo for compatibility with the new pydantic version shortly.

UPDATE: Fixed, it should work out of the box now. I have tested the new pydantic_settings.BaseSettings, and there seem to be some issues with parsing environment variables, though I haven't figured out yet whether it's a bug or a change in class behavior.

SthPhoenix / InsightFace-REST

deploy_trt.sh closes while executing, server couldn't start #114