Closed tomaszstachera closed 1 year ago
@tomaszstachera There are a number of errors reported when loading the model.
E1011 08:02:59.085540 1 logging.cc:40] [runtime.cpp::isCudaInstalledCorrectly::38] Error Code 6: Internal Error (CUDA initialization failure with error 35. Please check your CUDA installation: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html)
That is why the server failed and exited. It appears to be an issue with the CUDA setup on the machine.
Additionally, 21.08 is an almost two-year-old release. Can you move to a newer version?
You are wrong: the root cause of the issue was additional files in the S3 bucket, which is not mentioned anywhere in the logs. After cleaning up the bucket by trial and error, it worked. There was no need to fix the CUDA errors.
The newest version from here: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tritonserver/tags (23.09-py3) throws different boto3 errors, so it is not working either.
BTW, I am using Seldon, and its newest version points to 21.08. https://artifacthub.io/packages/helm/seldon/seldon-core-operator
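For anyone hitting the same symptom, here is a rough sketch of how stray objects in the bucket could be spotted before starting the server. It is not taken from Triton or Seldon. It assumes a simplified repository layout of `<model>/config.pbtxt` plus `<model>/<numeric-version>/<files>`, and the function name `find_unexpected_keys` and the sample keys are hypothetical. Triton may accept other files (for example label files), so treat a hit as something to inspect, not as a confirmed error.

```python
import re

# ASSUMPTION: simplified layout, relative to the repository prefix:
#   <model>/config.pbtxt
#   <model>/<numeric-version>/<any file>
_ALLOWED = re.compile(r"^[^/]+/(config\.pbtxt|\d+/.+)$")


def find_unexpected_keys(keys, prefix=""):
    """Return the keys under `prefix` that do not fit the assumed layout."""
    # Normalise the prefix so "models" does not also match "models2/...".
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    unexpected = []
    for key in keys:
        if not key.startswith(prefix):
            continue  # object lives outside the model repository
        relative = key[len(prefix):]
        if not relative or relative.endswith("/"):
            continue  # zero-byte "folder" placeholder object
        if not _ALLOWED.match(relative):
            unexpected.append(key)
    return unexpected


if __name__ == "__main__":
    # Hypothetical listing, not the bucket from this issue.
    sample = [
        "models/mymodel/config.pbtxt",
        "models/mymodel/1/model.py",
        "models/README.md",
        "models/mymodel/notes.txt",
    ]
    for key in find_unexpected_keys(sample, prefix="models"):
        print("unexpected:", key)
```

The key list can come from any S3 listing, for example the `Key` fields returned by boto3's `list_objects_v2`. The check itself is kept free of network calls so it can be run against a pasted listing.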
Description After a model loads successfully, it is unloaded with no explanation. Logs:
Triton Information nvcr.io/nvidia/tritonserver:21.08-py3
Are you using the Triton container or did you build it yourself? nvcr.io/nvidia/tritonserver:21.08-py3
To Reproduce Code for SeldonDeployment of inference server:
Content of S3 bucket with Python type model:
Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).
Expected behavior The server should serve predictions.