triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Building AMD64 Triton on an ARM64 machine generates an ARM64 executable #7745

Open ti1uan opened 3 weeks ago

ti1uan commented 3 weeks ago

Description
Hi, I tried to build x86_64 Triton from source using the build.py script on an ARM64 machine, but the resulting executable was built for the arm64 architecture.

Command: python3 build.py --target-platform=linux --target-machine=x86_64 --backend=python --backend=fil --endpoint=sagemaker --enable-logging --enable-metrics --enable-stats --build-type=Release

Triton Information
What version of Triton are you using? 24.09

Are you using the Triton container or did you build it yourself? I built it myself.

To Reproduce
1. On an ARM64 machine, run: python3 build.py --target-platform=linux --target-machine=x86_64 --backend=python --backend=fil --endpoint=sagemaker --enable-logging --enable-metrics --enable-stats --build-type=Release
2. Start a shell in the resulting tritonserver image: docker run -it tritonserver /bin/bash
3. Install the file utility: apt-get update && apt-get install -y file
4. Check the architecture of the tritonserver executable: file /opt/tritonserver/bin/tritonserver

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well). N/A

Expected behavior
With the --target-machine=x86_64 option, the executable should be built for the amd64 (x86_64) architecture.
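For reference, the architecture check in the reproduction steps can also be done without installing file, by reading the e_machine field of the ELF header directly. The following is a minimal sketch, not part of Triton or build.py; the function names elf_machine and binary_machine are made up for illustration, and only the two e_machine values relevant to this issue are mapped.

```python
import struct
import sys

# e_machine values defined by the ELF specification (only the two relevant here).
ELF_MACHINES = {62: "x86_64", 183: "aarch64"}


def elf_machine(header: bytes) -> str:
    """Return the target architecture recorded in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5) gives the byte order: 1 = little-endian, 2 = big-endian.
    if header[5] == 1:
        byteorder = "<"
    elif header[5] == 2:
        byteorder = ">"
    else:
        raise ValueError("unknown ELF data encoding")
    # e_machine is a 16-bit field at offset 18, right after e_ident and e_type.
    (machine,) = struct.unpack_from(byteorder + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown ({machine})")


def binary_machine(path: str) -> str:
    """Hypothetical helper: read the ELF header of the file at `path`."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))


if __name__ == "__main__" and len(sys.argv) > 1:
    print(binary_machine(sys.argv[1]))
```

Run inside the container against /opt/tritonserver/bin/tritonserver, a correctly cross-compiled build would be expected to report x86_64, while the behavior described in this issue would report aarch64.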

rmccorm4 commented 3 days ago

Hi @ti1uan, thanks for raising this issue.

@nv-kmcgill53 @mc-nv are you familiar with our support for cross-compiling with build.py? I'm not sure if we currently have tests that verify everything end-to-end works as expected when cross-compiled, rather than compiling on the host architecture.

mc-nv commented 3 days ago

With Docker, users can use the QEMU emulator to run multi-platform builds. Implementing this can be very time-consuming and hard to troubleshoot (especially since our scenarios include a Docker-out-of-Docker configuration).

We will probably have to defer this task until we start refactoring the CMake configuration.
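As a diagnostic aid for the QEMU route mentioned above: on Linux, user-mode emulation is normally wired up through binfmt_misc, and the registered handlers appear as files under /proc/sys/fs/binfmt_misc. The sketch below checks whether a handler for a given architecture is registered and enabled. It assumes the handler is named qemu-<arch> (for example qemu-x86_64), which is the common convention but not guaranteed; the function name qemu_handler_enabled is hypothetical and nothing here is part of build.py.

```python
from pathlib import Path


def qemu_handler_enabled(arch: str, binfmt_dir: str = "/proc/sys/fs/binfmt_misc") -> bool:
    """Return True if a binfmt_misc entry named qemu-<arch> exists and is enabled.

    The entry name is an assumption based on the usual qemu-user-static naming.
    """
    entry = Path(binfmt_dir) / f"qemu-{arch}"
    try:
        lines = entry.read_text().splitlines()
    except OSError:
        # Missing entry, or binfmt_misc is not mounted or not readable.
        return False
    # The first line of a binfmt_misc entry is "enabled" or "disabled".
    return bool(lines) and lines[0].strip() == "enabled"


if __name__ == "__main__":
    print("qemu-x86_64 handler enabled:", qemu_handler_enabled("x86_64"))
```

A False result on the ARM64 build host would suggest that x86_64 containers cannot be emulated there yet, independent of how build.py handles --target-machine.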