triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

TRITON with Pytorch CPU only build not working #7460

Open ndeep27 opened 2 months ago

ndeep27 commented 2 months ago

Description: The Triton Server build with the PyTorch backend is not working for CPU-only. It expects libraries like libcudart.so even though the build was CPU-only. Below is how we invoke the build. From another issue thread we learned that the CPU-only build was fixed from v22.04 onwards.

python ./build.py --cmake-dir=$(pwd) --build-dir=/tmp/citritonbuild \
    --endpoint=http --endpoint=grpc \
    --enable-logging --enable-stats --enable-tracing --enable-metrics \
    --backend=pytorch:${tritonversion} \
    --repo-tag=common:${tritonversion} --repo-tag=core:${tritonversion} \
    --repo-tag=backend:${tritonversion} --repo-tag=thirdparty:${tritonversion} \
    --no-container-build \
    --extra-core-cmake-arg=TRITON_ENABLE_GPU=OFF \
    --extra-core-cmake-arg=TRITON_ENABLE_ONNXRUNTIME_TENSORRT=OFF \
    --extra-backend-cmake-arg=pytorch:TRITON_ENABLE_GPU=OFF \
    --upstream-container-version=22.04

Triton Information What version of Triton are you using? 22.04

Are you using the Triton container or did you build it yourself? We built it ourselves.

To Reproduce Steps to reproduce the behavior. Mentioned above

sourabh-burnwal commented 2 months ago

Hi @ndeep27, can you try one of the latest stable versions to see if the issue persists? 22.04 is more than 2 years old, and there has been a lot of development since then.

ndeep27 commented 2 months ago

@sourabh-burnwal Even the latest version (24.07) fails without giving any specific error. For instance, below is what I see in the log:

gmake[3]: Leaving directory '/tmp/citritonbuild2406/tritonserver/build/_deps/repo-third-party-build/grpc/src/grpc-build'
cd /tmp/citritonbuild2406/tritonserver/build/_deps/repo-third-party-build/grpc/src/grpc-build && /usr/local/bin/cmake -E touch /tmp/citritonbuild2406/tritonserver/build/_deps/repo-third-party-build/grpc/src/grpc-stamp/grpc-install
[ 84%] Completed 'grpc'
cd /tmp/citritonbuild2406/tritonserver/build/_deps/repo-third-party-build && /usr/local/bin/cmake -E make_directory /tmp/citritonbuild2406/tritonserver/build/_deps/repo-third-party-build/CMakeFiles
cd /tmp/citritonbuild2406/tritonserver/build/_deps/repo-third-party-build && /usr/local/bin/cmake -E touch /tmp/citritonbuild2406/tritonserver/build/_deps/repo-third-party-build/CMakeFiles/grpc-complete
cd /tmp/citritonbuild2406/tritonserver/build/_deps/repo-third-party-build && /usr/local/bin/cmake -E touch /tmp/citritonbuild2406/tritonserver/build/_deps/repo-third-party-build/grpc/src/grpc-stamp/grpc-done
gmake[2]: Leaving directory '/tmp/citritonbuild2406/tritonserver/build'
[ 84%] Built target grpc
gmake[1]: Leaving directory '/tmp/citritonbuild2406/tritonserver/build'
gmake: *** [all] Error 2

I am not sure what the exact error is here.
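The final "Error 2" is only make propagating a failure from a sub-build, so the real error appears earlier in the log. One way to dig it out (a sketch, assuming the build tree shown in the log above and that the full output is captured to a file):

# Re-run only the failing step serially and verbosely so the first real error
# is printed right before the build stops
cd /tmp/citritonbuild2406/tritonserver/build
make -j1 VERBOSE=1 2>&1 | tee /tmp/triton-build-verbose.log

# Then look for the first compiler/linker error rather than the final summary
grep -n -E "error:|undefined reference|No such file or directory" /tmp/triton-build-verbose.log | head -n 20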

ndeep27 commented 1 month ago

@sourabh-burnwal Can you please help with the above?

ndeep27 commented 1 month ago

@sourabh-burnwal Is there a CPU-only version of Triton with PyTorch released as open source?

sourabh-burnwal commented 1 month ago

Hi @ndeep27, is there any specific reason you want to build Triton yourself, and CPU-only at that? You can always control device access when starting the container or from the model config.
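As a concrete reference (a minimal sketch, not taken from this thread: the model name, tensor names, and shapes are placeholders), pinning a PyTorch model to CPU in its config.pbtxt looks roughly like this:

# Sketch only: "my_model" and the tensor names/shapes are placeholders.
# The instance_group entry with KIND_CPU is what pins the model to CPU.
cat > model_repository/my_model/config.pbtxt <<'EOF'
name: "my_model"
backend: "pytorch"
max_batch_size: 8
input [
  {
    name: "INPUT__0"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "OUTPUT__0"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
instance_group [
  {
    count: 1
    kind: KIND_CPU
  }
]
EOF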

ndeep27 commented 1 month ago

@sourabh-burnwal We do set the device via the model config, where we specify CPU, but the issue is that Triton libraries like libtorch_cpu.so need libcudart and other related CUDA libraries, which prevents us from running on CPU-only hosts.

sourabh-burnwal commented 1 month ago

@ndeep27, I have run the NGC Triton image on a CPU-only system without any problems. Can you share what issue you are getting?

ndeep27 commented 1 month ago

@sourabh-burnwal Can you send me the exact Docker image you used?

ndeep27 commented 1 month ago

For instance, I downloaded nvcr.io/nvidia/tritonserver:24.07-pyt-python-py3, and when I ssh into the host and run the commands below, I see libraries like libcudart linked:

root@031a384b7f38:/opt/tritonserver# ldd backends/pytorch/libtorch_cpu.so
        linux-vdso.so.1 (0x00007fff1d4cf000)
        libc10.so => /opt/tritonserver/backends/pytorch/libc10.so (0x00007c0f0a949000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007c0f0a93c000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007c0f0a91c000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007c0f0a917000)
        libmkl_intel_lp64.so.1 => /opt/tritonserver/backends/pytorch/libmkl_intel_lp64.so.1 (0x00007c0f09a00000)
        libmkl_gnu_thread.so.1 => /opt/tritonserver/backends/pytorch/libmkl_gnu_thread.so.1 (0x00007c0f07a00000)
        libmkl_core.so.1 => /opt/tritonserver/backends/pytorch/libmkl_core.so.1 (0x00007c0eff800000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007c0f0a910000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007c0f0a829000)
        libgomp.so.1 => /lib/x86_64-linux-gnu/libgomp.so.1 (0x00007c0f0a7df000)
        libcupti.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libcupti.so.12 (0x00007c0efee00000)
        libmpi.so.40 => /opt/hpcx/ompi/lib/libmpi.so.40 (0x00007c0f098e1000)
        libcudart.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libcudart.so.12 (0x00007c0efea00000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007c0efe7d4000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007c0efe5ab000)
        /lib64/ld-linux-x86-64.so.2 (0x00007c0f1699b000)
        libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007c0f0a7d8000)
        libopen-rte.so.40 => /opt/hpcx/ompi/lib/libopen-rte.so.40 (0x00007c0f07941000)
        libopen-pal.so.40 => /opt/hpcx/ompi/lib/libopen-pal.so.40 (0x00007c0efece9000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007c0f0a7ba000)

root@031a384b7f38:/opt/tritonserver# ldd lib/libtritonserver.so
        linux-vdso.so.1 (0x00007ffc46cfb000)
        libnuma.so.1 => /lib/x86_64-linux-gnu/libnuma.so.1 (0x0000705fc1e05000)
        libcudart.so.12 => /usr/local/cuda/targets/x86_64-linux/lib/libcudart.so.12 (0x0000705fc1a00000)
        libdcgm.so.3 => /lib/x86_64-linux-gnu/libdcgm.so.3 (0x0000705fc16a3000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x0000705fc1de9000)
        libcurl.so.4 => /lib/x86_64-linux-gnu/libcurl.so.4 (0x0000705fc1d40000)
        libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x0000705fc15ff000)
        libxml2.so.2 => /lib/x86_64-linux-gnu/libxml2.so.2 (0x0000705fc141d000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x0000705fc11f1000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x0000705fc110a000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x0000705fc1d20000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000705fc0ee1000)
        /lib64/ld-linux-x86-64.so.2 (0x0000705fc32b1000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x0000705fc1d19000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x0000705fc1d14000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x0000705fc1d0f000)
        libnghttp2.so.14 => /lib/x86_64-linux-gnu/libnghttp2.so.14 (0x0000705fc1ce5000)
        libidn2.so.0 => /lib/x86_64-linux-gnu/libidn2.so.0 (0x0000705fc1cc4000)
        librtmp.so.1 => /lib/x86_64-linux-gnu/librtmp.so.1 (0x0000705fc0ec2000)
        libssh.so.4 => /lib/x86_64-linux-gnu/libssh.so.4 (0x0000705fc0e55000)
        libpsl.so.5 => /lib/x86_64-linux-gnu/libpsl.so.5 (0x0000705fc0e41000)
        libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x0000705fc09fd000)
        libgssapi_krb5.so.2 => /lib/x86_64-linux-gnu/libgssapi_krb5.so.2 (0x0000705fc09a9000)
        libldap-2.5.so.0 => /lib/x86_64-linux-gnu/libldap-2.5.so.0 (0x0000705fc094a000)
        liblber-2.5.so.0 => /lib/x86_64-linux-gnu/liblber-2.5.so.0 (0x0000705fc0939000)
        libzstd.so.1 => /lib/x86_64-linux-gnu/libzstd.so.1 (0x0000705fc086a000)
        libbrotlidec.so.1 => /lib/x86_64-linux-gnu/libbrotlidec.so.1 (0x0000705fc1cb2000)
        libicuuc.so.70 => /lib/x86_64-linux-gnu/libicuuc.so.70 (0x0000705fc066f000)
        liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x0000705fc0644000)
        libunistring.so.2 => /lib/x86_64-linux-gnu/libunistring.so.2 (0x0000705fc049a000)
        libgnutls.so.30 => /lib/x86_64-linux-gnu/libgnutls.so.30 (0x0000705fc02af000)
        libhogweed.so.6 => /lib/x86_64-linux-gnu/libhogweed.so.6 (0x0000705fc0267000)
        libnettle.so.8 => /lib/x86_64-linux-gnu/libnettle.so.8 (0x0000705fc0221000)
        libgmp.so.10 => /lib/x86_64-linux-gnu/libgmp.so.10 (0x0000705fc019f000)
        libkrb5.so.3 => /lib/x86_64-linux-gnu/libkrb5.so.3 (0x0000705fc00d2000)
        libk5crypto.so.3 => /lib/x86_64-linux-gnu/libk5crypto.so.3 (0x0000705fc00a3000)
        libcom_err.so.2 => /lib/x86_64-linux-gnu/libcom_err.so.2 (0x0000705fc009d000)
        libkrb5support.so.0 => /lib/x86_64-linux-gnu/libkrb5support.so.0 (0x0000705fc008f000)
        libsasl2.so.2 => /lib/x86_64-linux-gnu/libsasl2.so.2 (0x0000705fc0074000)
        libbrotlicommon.so.1 => /lib/x86_64-linux-gnu/libbrotlicommon.so.1 (0x0000705fc0051000)
        libicudata.so.70 => /lib/x86_64-linux-gnu/libicudata.so.70 (0x0000705fbe431000)
        libp11-kit.so.0 => /lib/x86_64-linux-gnu/libp11-kit.so.0 (0x0000705fbe2f6000)
        libtasn1.so.6 => /lib/x86_64-linux-gnu/libtasn1.so.6 (0x0000705fbe2de000)
        libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x0000705fbe2d7000)
        libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x0000705fbe2c3000)
        libffi.so.8 => /lib/x86_64-linux-gnu/libffi.so.8 (0x0000705fbe2b4000)

On CPU-only hosts these libraries are not available.
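Inside the NGC image the CUDA runtime libraries are bundled under /usr/local/cuda (as the ldd output above shows), which is why they resolve there even without a GPU; the problem surfaces when the tree is used outside the image. A quick way to check, on the target CPU-only host, which dependencies actually fail to resolve (a sketch, assuming the /opt/tritonserver layout shown above):

# List only the shared-library dependencies the dynamic linker cannot resolve
# on this host; an empty result means everything needed is present
ldd /opt/tritonserver/backends/pytorch/libtorch_cpu.so | grep "not found"
ldd /opt/tritonserver/lib/libtritonserver.so | grep "not found"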

ndeep27 commented 1 month ago

We also cannot use these Docker images directly, since our OS is a variant of RHEL, so we have to build Triton from source for our OS.

sourabh-burnwal commented 1 month ago

@ndeep27 I see. Those files get shipped with the docker image.

Can you start a container with that 24.07 Docker image without giving it GPU device access, then try loading your PyTorch model after specifying the device as CPU in config.pbtxt?

Regarding your use case of running this on a RHEL/CentOS-based system: I think, as long as you have Docker installed and configured, you should be able to run it.
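Concretely, that would be something like the following (a sketch; the model repository path is a placeholder):

# Start the 24.07 image with no GPU device access (no --gpus flag) and a local
# model repository; models whose config specifies KIND_CPU should load on CPU
docker run --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v /path/to/model_repository:/models \
  nvcr.io/nvidia/tritonserver:24.07-pyt-python-py3 \
  tritonserver --model-repository=/models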

ndeep27 commented 1 month ago

@sourabh-burnwal What I mean is that we cannot directly use that Docker image (these are built for Ubuntu, but internally we have to build on RHEL due to security constraints). We have to build it from source. I am guessing these containers were also produced by building from source, right? We need a way to build from source so that it works on CPU too.

root@0d33724e583c:/opt/tritonserver# cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
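For reference, a CPU-only source build against a newer release would presumably reuse the same flags as the 22.04 invocation earlier in this issue, only with the newer branch tags (a sketch only: the r24.07 tag naming and flag compatibility across releases are assumptions, and this has not been verified on RHEL):

# Sketch: flags mirror the 22.04 invocation above, switched to the 24.07 release tags
tritonversion=r24.07
python ./build.py --cmake-dir=$(pwd) --build-dir=/tmp/citritonbuild \
    --no-container-build \
    --endpoint=http --endpoint=grpc \
    --enable-logging --enable-stats --enable-tracing --enable-metrics \
    --backend=pytorch:${tritonversion} \
    --repo-tag=common:${tritonversion} --repo-tag=core:${tritonversion} \
    --repo-tag=backend:${tritonversion} --repo-tag=thirdparty:${tritonversion} \
    --extra-core-cmake-arg=TRITON_ENABLE_GPU=OFF \
    --extra-backend-cmake-arg=pytorch:TRITON_ENABLE_GPU=OFF \
    --upstream-container-version=24.07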

ndeep27 commented 1 month ago

If I copy the files (under /opt/tritonserver) from this Docker image and add them to our custom inference pipeline, will that work?

sourabh-burnwal commented 1 month ago

> If I copy the files (under /opt/tritonserver) from this Docker image and add them to our custom inference pipeline, will that work?

I don't think this will work, as the build might also depend on system libraries. I can try to reproduce your issue, but that will take some time, as I am currently using Ubuntu.