Open stevelcb opened 2 weeks ago
Hi,
I believe you have to include the ROCMExecutionProvider in graxpert/ai_model_handling.py, cf.:
https://github.com/Steffenhir/GraXpert/blob/593cdddf9b3a633e63b7cd45a86903e03b09c89b/graxpert/ai_model_handling.py#L169
CS, David
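A minimal sketch of what such a change might look like, assuming the provider list is derived from what onnxruntime reports as available. The helper name `pick_providers` and the preference order are illustrative assumptions, not GraXpert's actual code at the linked line:

```python
# Hypothetical helper, NOT the actual code in graxpert/ai_model_handling.py.
# GPU providers to try, in an assumed order of preference.
PREFERRED_GPU_PROVIDERS = ("CUDAExecutionProvider", "ROCMExecutionProvider")


def pick_providers(available, preferred=PREFERRED_GPU_PROVIDERS):
    """Return the preferred GPU providers that are available, then the CPU fallback."""
    chosen = [p for p in preferred if p in available]
    # Always end with the CPU provider so inference can still run without a GPU.
    chosen.append("CPUExecutionProvider")
    return chosen


# Possible usage, if onnxruntime is installed:
#   import onnxruntime as ort
#   providers = pick_providers(ort.get_available_providers())
#   session = ort.InferenceSession(model_path, providers=providers)
```

Keeping the selection in a plain function means it can be checked without a GPU present; `model_path` in the usage comment is a placeholder.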
Thanks David. Unfortunately:
python -m graxpert.main
2024-06-18 22:13:02,761 MainProcess root WARNING Could not check for newest version
2024-06-18 22:13:11,367 ForkProcess-2 root INFO stretch.stretch_channel started
2024-06-18 22:13:11,367 ForkProcess-3 root INFO stretch.stretch_channel started
2024-06-18 22:13:11,367 ForkProcess-4 root INFO stretch.stretch_channel started
2024-06-18 22:13:11,367 ForkProcess-2 root INFO stretch.stretch_channel started
2024-06-18 22:13:11,367 ForkProcess-3 root INFO stretch.stretch_channel started
2024-06-18 22:13:11,367 ForkProcess-4 root INFO stretch.stretch_channel started
2024-06-18 22:13:11,822 ForkProcess-2 root INFO stretch.stretch_channel finished
2024-06-18 22:13:11,822 ForkProcess-2 root INFO stretch.stretch_channel finished
2024-06-18 22:13:11,853 ForkProcess-3 root INFO stretch.stretch_channel finished
2024-06-18 22:13:11,853 ForkProcess-4 root INFO stretch.stretch_channel finished
2024-06-18 22:13:11,853 ForkProcess-3 root INFO stretch.stretch_channel finished
2024-06-18 22:13:11,853 ForkProcess-4 root INFO stretch.stretch_channel finished
2024-06-18 22:13:24,273 MainProcess root INFO Progress: 8%
2024-06-18 22:13:24,278 MainProcess root INFO Progress: 16%
2024-06-18 22:13:24,280 MainProcess root INFO Progress: 24%
2024-06-18 22:13:24,280 MainProcess root INFO Progress: 32%
2024-06-18 22:13:25,119 MainProcess root INFO Providers : ['ROCMExecutionProvider', 'CPUExecutionProvider']
2024-06-18 22:13:25,119 MainProcess root INFO Used providers : ['ROCMExecutionProvider', 'CPUExecutionProvider']
rocBLAS error from hip error code: 'hipErrorInvalidDeviceFunction':98
2024-06-18 22:13:25.122326547 [E:onnxruntime:Default, rocm_call.cc:119 RocmCall] ROCBLAS failure 6: rocblas_status_internal_error ; GPU=0 ; hostname=cocina ; file=/onnxruntime/build/Linux/Release/amdgpu/onnxruntime/core/providers/rocm/tensor/transpose.cc ; line=65 ; expr=rocblasTransposeHelper(stream, rocblas_handle, rocblas_operation_transpose, rocblas_operation_transpose, M, N, &one, input_data, N, &zero, input_data, N, output_data, M);
2024-06-18 22:13:25.122341807 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Transpose node. Name:'StatefulPartitionedCall/model/sequential/conv2d/Conv2D__6' Status Message: ROCBLAS failure 6: rocblas_status_internal_error ; GPU=0 ; hostname=cocina ; file=/onnxruntime/build/Linux/Release/amdgpu/onnxruntime/core/providers/rocm/tensor/transpose.cc ; line=65 ; expr=rocblasTransposeHelper(stream, rocblas_handle, rocblas_operation_transpose, rocblas_operation_transpose, M, N, &one, input_data, N, &zero, input_data, N, output_data, M);
2024-06-18 22:13:25,177 MainProcess root ERROR [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running Transpose node. Name:'StatefulPartitionedCall/model/sequential/conv2d/Conv2D__6' Status Message: ROCBLAS failure 6: rocblas_status_internal_error ; GPU=0 ; hostname=cocina ; file=/onnxruntime/build/Linux/Release/amdgpu/onnxruntime/core/providers/rocm/tensor/transpose.cc ; line=65 ; expr=rocblasTransposeHelper(stream, rocblas_handle, rocblas_operation_transpose, rocblas_operation_transpose, M, N, &one, input_data, N, &zero, input_data, N, output_data, M);
Traceback (most recent call last):
File "/home/steve/GraXpert/graxpert/application/app.py", line 149, in on_calculate_request
extract_background(
File "/home/steve/GraXpert/graxpert/background_extraction.py", line 80, in extract_background
background = session.run(None, {"gen_input_image": np.expand_dims(imarray_shrink, axis=0)})[0][0]
File "/home/steve/GraXpert/graxpert-env/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 220, in run
return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running Transpose node. Name:'StatefulPartitionedCall/model/sequential/conv2d/Conv2D__6' Status Message: ROCBLAS failure 6: rocblas_status_internal_error ; GPU=0 ; hostname=cocina ; file=/onnxruntime/build/Linux/Release/amdgpu/onnxruntime/core/providers/rocm/tensor/transpose.cc ; line=65 ; expr=rocblasTransposeHelper(stream, rocblas_handle, rocblas_operation_transpose, rocblas_operation_transpose, M, N, &one, input_data, N, &zero, input_data, N, output_data, M);
Log attached: graxpert.log.5.txt
Ubuntu 22.04
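The traceback above shows the failure surfacing from `session.run(...)` only after the ROCm provider had been selected. One possible way to keep the application usable in that situation is to retry on the CPU provider when the GPU run fails. This is a sketch under assumptions: `run_with_fallback` and `make_session` are hypothetical names, and this is not how GraXpert currently handles the error:

```python
def run_with_fallback(make_session, provider_lists, feed):
    """Try each provider list in order and return (outputs, providers_used).

    make_session: callable taking a provider list and returning an object
                  with a run(output_names, input_feed) method.
    provider_lists: provider lists to try, most preferred first.
    feed: the input dictionary passed to run().
    """
    last_error = None
    for providers in provider_lists:
        try:
            session = make_session(providers)
            return session.run(None, feed), providers
        except Exception as exc:  # kept broad here; onnxruntime raises its own error types
            last_error = exc
    raise RuntimeError("all provider lists failed") from last_error


# Possible usage, if onnxruntime is installed (model_path is a placeholder):
#   import onnxruntime as ort
#   outputs, used = run_with_fallback(
#       lambda providers: ort.InferenceSession(model_path, providers=providers),
#       [["ROCMExecutionProvider", "CPUExecutionProvider"], ["CPUExecutionProvider"]],
#       {"gen_input_image": input_array},
#   )
```

This only hides the symptom. The underlying rocBLAS error would still need to be resolved for GPU acceleration to work.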
Hi everyone, I thought I'd post an update on this, having tried to get ROCm GPU acceleration recognised via ONNX.
We created the environment for building GraX as described here: https://github.com/Steffenhir/GraXpert
We then set up AMD's ROCm as described here: https://rocm.docs.amd.com/projects/radeon/en/latest/docs/install/install-onnx.html
That works fine and AMD's ROCm is indeed available via onnxruntime:
We then built... However, GraX still sees only the CPU:
Reading to the end of the AMD document, I see that it works with Radeon RX 7900 XTX, RX 7900 XT, RX 7900 GRE, PRO W7900 and PRO W7800.
I have a gfx90, so I'm not sure whether the GPU will be visible to GraX. It is visible to other programs, such as StarTools, but that's via OpenCL.
Still thinking... Any ideas, anyone? Cheers and TIA
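On the question of which gfx target a card reports: a small sketch for pulling gfx names out of diagnostic text, so they can be compared against a support list. It assumes the text comes from a tool that prints target names such as `gfx1100` (for example `rocminfo`, whose exact output is not shown in this thread). The function name `find_gfx_targets` is hypothetical:

```python
import re

# Matches names such as gfx900 or gfx1100; the hex-digit suffix form is an assumption.
GFX_PATTERN = re.compile(r"\bgfx[0-9a-f]+\b")


def find_gfx_targets(text):
    """Return the distinct gfx target names found in text, in first-seen order."""
    seen = []
    for name in GFX_PATTERN.findall(text):
        if name not in seen:
            seen.append(name)
    return seen


# Possible usage on Linux with ROCm installed (assumes rocminfo is on PATH):
#   import subprocess
#   output = subprocess.run(["rocminfo"], capture_output=True, text=True).stdout
#   print(find_gfx_targets(output))
```

If the reported target is not one the ROCm build of onnxruntime ships kernels for, that may be worth ruling out before changing anything in GraXpert itself.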