dusty-nv / jetson-containers

Machine Learning Containers for NVIDIA Jetson and JetPack-L4T

Ubuntu Container for Jetson ORIN #673

Open ashwin-999 opened 1 month ago

ashwin-999 commented 1 month ago

Hi, thank you so much for this resource. It's been really helpful in guiding my work.

I have a Docker setup working with l4t-jetpack:r36.3.0. However, I am now at a juncture where I have to start from a non-JetPack base image, so I am working with nvidia/cuda:12.2.0-devel-ubuntu22.04.

I am able to install onnxruntime (from source) and PyTorch 2.3 with nvidia/cuda:12.2.0-devel-ubuntu22.04 as the base image.

I have tiny scripts to check for GPU access in PyTorch and ORT, both of which show that the GPU can be found. However, it fails when trying to load the ONNX model.

import torch

is_cuda_available = torch.cuda.is_available()
print(f"Is CUDA available: {is_cuda_available}")
assert is_cuda_available, "CUDA is not available"

print(f"Current CUDA device: {torch.cuda.current_device()}")

print(f"Number of GPUs: {torch.cuda.device_count()}")

print(f"CUDA device name: {torch.cuda.get_device_name(torch.cuda.current_device())}")

print(f"CUDA memory allocated: {torch.cuda.memory_allocated() / (1024 ** 2):.2f} MB")
print(f"CUDA memory cached: {torch.cuda.memory_reserved() / (1024 ** 2):.2f} MB")

# print total GPU memory
if is_cuda_available:
    total_memory = torch.cuda.get_device_properties(torch.cuda.current_device()).total_memory
    print(f"Total GPU memory: {total_memory / (1024 ** 2):.2f} MB")
else:
    print("CUDA is not available.")

if torch.backends.cudnn.is_available():
    print("cuDNN is installed.")
else:
    print("cuDNN is not installed.")

print(torch.backends.cudnn.version())

Is CUDA available: True
Current CUDA device: 0
Number of GPUs: 1
CUDA device name: Orin
CUDA memory allocated: 0.00 MB
CUDA memory cached: 0.00 MB
Total GPU memory: 62841.44 MB
cuDNN is installed.
8907

import onnxruntime as ort

sess_options = ort.SessionOptions()
sess_options.log_severity_level = 0
providers = ort.get_available_providers()
print("Available providers:", providers)

if 'CUDAExecutionProvider' in providers:
    print("GPU is available for ONNX Runtime")
else:
    print("GPU is not available for ONNX Runtime")

model_path = "epoch103.nms.fp32.simplified.onnx"
providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
session = ort.InferenceSession(model_path, sess_options=sess_options, providers=providers)
print(f"Model loaded successfully: {model_path}")

2024-10-10 00:28:10.301568556 [E:onnxruntime:, inference_session.cc:2117 operator()] Exception during initialization: /opt/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:129 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; SUCCTYPE = cublasStatus_t; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /opt/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cublasStatus_t; bool THRW = true; SUCCTYPE = cublasStatus_t; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUBLAS failure 3: CUBLAS_STATUS_ALLOC_FAILED ; GPU=0 ; hostname=ubuntu ; file=/opt/onnxruntime/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=178 ; expr=cublasCreate(&cublas_handle_);

It feels like I'm almost there, but I am (obviously) missing a piece of the puzzle. Upon searching, I see there are CUDA Toolkits specifically for Jetson (not sure if this is even the source of the issue), but I didn't find one for CUDA 12.2 on Ubuntu 22.04.

Could someone help me understand/resolve what's going on with this CUBLAS error?
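For context, one sanity check I can think of (my own sketch, not something from this repo) is to force a cuBLAS call through PyTorch in the same container; a small matmul goes through cuBLAS, so if handle creation is broken at the library level it should fail here too, independent of onnxruntime:

import torch

# A small matmul on the GPU exercises cuBLAS (including handle creation)
# outside of onnxruntime. If the container's libcublas is the wrong flavor
# for Jetson, this should fail in a similar way to the ORT error above.
a = torch.randn(64, 64, device="cuda")
b = torch.randn(64, 64, device="cuda")
c = a @ b
torch.cuda.synchronize()
print("cuBLAS matmul OK:", c.shape)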

dpkg -L libcudnn8 and dpkg -L libcudnn8-dev on both containers show the same paths.
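Along the same lines, a small hand-rolled snippet can show which cuBLAS/cuDNN shared libraries the dynamic loader actually resolves inside each container, which is easier to diff against the working l4t-jetpack image than the dpkg listings alone:

import ctypes
import ctypes.util

# Resolve the libraries ORT's CUDA provider needs via the ldconfig cache and
# try to load them. Printing the resolved names in both containers makes the
# difference (if any) easy to spot. CDLL() raises OSError if the library
# cannot actually be loaded.
for name in ("cublas", "cublasLt", "cudnn"):
    resolved = ctypes.util.find_library(name)
    print(name, "->", resolved)
    if resolved:
        ctypes.CDLL(resolved)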

johnnynunez commented 1 month ago

You are using a Docker image for x86, not for Jetson!

Docker for Jetson: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/l4t-base

ashwin-999 commented 1 month ago

@johnnynunez nvidia/cuda:12.2.0-devel-ubuntu22.04 comes with an arm64 build as well. What would be the difference between the L4T Docker containers and a CUDA Ubuntu arm64 image?

If you had actually read my post, you would notice that I already have a working version of the Docker setup with the L4T Jetson base :)

I am trying to understand if and why an arm64 Docker image would or would not work (including GPU access) on Jetson.

johnnynunez commented 1 month ago

The L4T version comes with everything. I think the Ubuntu arm image is for SBSA (Grace, etc.).
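A rough way to check which flavor an image has (these paths are just heuristics on my part, not an official check) is to look for the Tegra-specific userspace versus the SBSA CUDA target:

import glob
import os

# Heuristics only: Jetson/L4T userspace ships Tegra-specific driver libraries
# and a release file, while the generic arm64 CUDA images target SBSA servers.
checks = {
    "Tegra driver libs (/usr/lib/aarch64-linux-gnu/tegra)":
        os.path.isdir("/usr/lib/aarch64-linux-gnu/tegra"),
    "L4T release file (/etc/nv_tegra_release)":
        os.path.isfile("/etc/nv_tegra_release"),
    "CUDA aarch64 (Jetson) target":
        bool(glob.glob("/usr/local/cuda*/targets/aarch64-linux")),
    "CUDA SBSA target":
        bool(glob.glob("/usr/local/cuda*/targets/sbsa-linux")),
}
for label, present in checks.items():
    print(f"{label}: {'yes' if present else 'no'}")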