triton-inference-server / fastertransformer_backend


CUDA function architecture error when trying to query the Triton server #98

Closed gd1m3y closed 1 year ago

gd1m3y commented 1 year ago

Hey, my setup is as follows:

Branch: v1.3
Version: TRITON_VERSION=22.07
GPU: NVIDIA GeForce GTX 1080 Ti x 4
CUDA info: Cuda compilation tools, release 11.7, V11.7.99, Build cuda_11.7.r11.7/compiler.31442593_0
NVIDIA-SMI 515.65.01, Driver Version: 515.65.01, CUDA Version: 11.7

STEPS TO REPRODUCE -> I followed the guide mentioned here (link) and then got the error while running the identity test.

ERROR RECEIVED -> This is the output I got while running the IDENTITY_TEST:

set request
terminate called after throwing an instance of 'std::runtime_error'
terminate called recursively
terminate called recursively
terminate called recursively
  what():  Signal ([FT][ERROR] CUDA runtime error: invalid device function /workspace/build/fastertransformer_backend/build/_deps/repo-ft-src/src/fastertransformer/kernels/sampling_topp_kernels.cu:1077
Signal (
Signal (66) received.Signal (6) received.
6
) received.) received.

ERROR SUMMARY -> The error is "invalid device function" in top-p sampling; the reported line is where check_cuda_error is called. We hit the same problem with other container and CUDA combinations: CUDA 11.6 with containers 22.01 and 22.02, and CUDA 12 (latest) with the latest Triton container.
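To make the summary above concrete, here is a minimal Python sketch that pulls the CUDA error message, source file, and line number out of a log line like the one quoted in this issue. The regular expression and the helper name `parse_cuda_error` are assumptions based only on that single quoted line; they are not a documented FasterTransformer log format.

```python
import re

# Assumed pattern, derived only from the one log line quoted in this issue.
CUDA_ERROR_RE = re.compile(
    r"CUDA runtime error: (?P<message>.+?) (?P<file>/\S+):(?P<line>\d+)"
)


def parse_cuda_error(log_line):
    """Return (message, file, line) or None if the line does not match."""
    match = CUDA_ERROR_RE.search(log_line)
    if match is None:
        return None
    return match.group("message"), match.group("file"), int(match.group("line"))


# The error line from the report, with the wrapped file name joined back together.
sample = (
    "what():  Signal ([FT][ERROR] CUDA runtime error: invalid device function "
    "/workspace/build/fastertransformer_backend/build/_deps/repo-ft-src/src/"
    "fastertransformer/kernels/sampling_topp_kernels.cu:1077"
)

parsed = parse_cuda_error(sample)
print(parsed)
```

The message part ("invalid device function") is the useful bit here: it points at a mismatch between the GPU and the architectures the kernels were built for, rather than at a bug on that specific source line.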

Does this error have something to do with my GPU? I have tried multiple combinations and guides for the FasterTransformer backend, and even the latest version (v1.4) gives the same error. Can someone please help or point out what we might be doing wrong? Thank you.

byshiue commented 1 year ago

The 1080 Ti is a Pascal-architecture GPU, and FT is not compiled for Pascal by default. For more details, please refer to the compilation settings of FT: https://github.com/NVIDIA/FasterTransformer/blob/main/docs/gpt_guide.md#build-the-project
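As a rough illustration of why a missing architecture leads to "invalid device function", the Python sketch below models CUDA's compatibility rules in simplified form: compiled device code (cubin) for `sm_XY` runs only on GPUs with the same major version and an equal or higher minor version, while embedded PTX for `compute_XY` can be JIT-compiled on any GPU with capability XY or higher. The default list `70, 75, 80, 86` is an assumption about what the linked build guide describes, and the names `parse_sm` and `can_run` are hypothetical.

```python
def parse_sm(sm):
    """Split an SM string into (major, minor): '61' -> (6, 1), '86' -> (8, 6)."""
    digits = sm.strip()
    return int(digits[:-1]), int(digits[-1])


def can_run(device_sm, built_sms, ptx_sms=()):
    """Simplified check of whether a binary has usable code for a device.

    built_sms: architectures with compiled device code (cubin).
    ptx_sms:   architectures with embedded PTX that can be JIT-compiled.
    """
    dev = parse_sm(device_sm)
    for sm in built_sms:
        major, minor = parse_sm(sm)
        # cubin is only compatible within the same major version
        if major == dev[0] and minor <= dev[1]:
            return True
    # PTX is forward compatible with any equal or newer architecture
    return any(parse_sm(sm) <= dev for sm in ptx_sms)


# Assumed default architecture list; check the FT build guide for the real one.
ASSUMED_DEFAULT_SMS = ["70", "75", "80", "86"]

print(can_run("61", ASSUMED_DEFAULT_SMS))  # GTX 1080 Ti is compute capability 6.1
print(can_run("61", ["61"]))
```

Under these assumptions the first call reports that a default build has no code a Pascal GPU can execute, which matches the failure at the first kernel launch; the second shows that adding 6.1 to the build resolves it.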

gd1m3y commented 1 year ago

Hey @byshiue, thank you for the great pointer. I now understand why it was not working. By the way, can you tell me which CUDA and Triton versions would be the most stable to install for the GTX 1080 Ti? Thanks.

byshiue commented 1 year ago

For the CUDA/Triton version, you can try the Docker images suggested in the documentation. We have verified correctness on these Docker images.

gd1m3y commented 1 year ago

Updating to a newer version of FasterTransformer and CUDA, and building FT for the lower SM 6.1, solved this issue for me. Thanks @byshiue.
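For anyone landing here with the same GPU, the fix above can be sketched as a small Python helper that turns a compute capability such as "6.1" into the `-DSM` value used by the FT build guide linked earlier in this thread. The helper names are hypothetical, the comma-separated form for several architectures is an assumption, and the exact CMake options for the Triton backend build may differ, so treat this as a sketch rather than the official build procedure.

```python
def sm_flag(compute_caps):
    """Turn compute capabilities like ['6.1'] into a '-DSM=61' style flag.

    Joining several architectures with commas is an assumption; verify it
    against the FT build guide before relying on it.
    """
    sms = sorted({cap.replace(".", "") for cap in compute_caps}, key=int)
    return "-DSM=" + ",".join(sms)


def cmake_command(compute_caps, build_type="Release"):
    """Compose a cmake invocation as an argument list (it is not executed here)."""
    return [
        "cmake",
        sm_flag(compute_caps),
        "-DCMAKE_BUILD_TYPE=" + build_type,
        "..",
    ]


# GTX 1080 Ti has compute capability 6.1
print(" ".join(cmake_command(["6.1"])))
```

Building only for the architectures actually present keeps compile time down, at the cost of a binary that will fail the same way on any other GPU generation.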