Running even a simple TensorFlow operation crashes with
"hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"
Aborted (core dumped)
To reproduce, install tensorflow-rocm with pip, start the Python interpreter, and run:
import tensorflow as tf
tf.add(1,2)
Python then crashes with the error message above.
Running the same code as a one-liner from the shell:
python -c "import tensorflow as tf; tf.add(1,2)"
yields the same error.
Below is part of my rocminfo output:
*******
Agent 2
*******
Name: gfx1031
Uuid: GPU-XX
Marketing Name: AMD Radeon RX 6700 XT
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
L2: 3072(0xc00) KB
L3: 98304(0x18000) KB
Chip ID: 29663(0x73df)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 2725
BDFID: 3072
Internal Node ID: 1
Compute Unit: 40
SIMDs per CU: 2
Shader Engines: 4
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 12566528(0xbfc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1031
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
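For anyone comparing against their own card, the detail that matters in the output above is the agent's gfx target (gfx1031 here). Below is a minimal sketch for pulling that out of rocminfo text. The helper names `gfx_targets` and `read_rocminfo` are hypothetical, and the regular expression assumes the agent line has the form `Name: gfxNNNN` as shown above.

```python
import re
import subprocess

# Matches agent lines such as "Name: gfx1031". It deliberately skips
# "Marketing Name:", "Vendor Name:" and the ISA line
# "Name: amdgcn-amd-amdhsa--gfx1031".
_GFX_LINE = re.compile(r"^[ \t]*Name:[ \t]+(gfx[0-9a-f]+)[ \t]*$", re.MULTILINE)


def gfx_targets(rocminfo_text):
    """Return the distinct gfx target names found in rocminfo output, in order."""
    return list(dict.fromkeys(_GFX_LINE.findall(rocminfo_text)))


def read_rocminfo():
    """Run rocminfo and return its stdout (assumes rocminfo is on PATH)."""
    result = subprocess.run(
        ["rocminfo"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `gfx_targets(read_rocminfo())` on a machine with ROCm installed should list the targets the runtime sees, which can then be checked against the architectures your tensorflow-rocm build was compiled for.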
Standalone code to reproduce the issue
import tensorflow as tf
tf.add(1,2)
Relevant log output
"hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"
Aborted (core dumped)
Your GPU (gfx1031) is not directly supported, but similar cards (gfx1030) are, so you can try setting this environment variable in the shell before running your script:
export HSA_OVERRIDE_GFX_VERSION=10.3.0
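If exporting the variable in the shell is inconvenient, the same override can probably be applied from inside Python, as long as it happens before TensorFlow is imported. This is only a sketch under that assumption: `apply_gfx_override` and `add_on_gpu` are hypothetical helper names, and whether the override works at all depends on the card and the ROCm version.

```python
import os

# Value taken from the suggestion above; assumed to make the ROCm runtime
# treat the card as gfx1030.
GFX_OVERRIDE = "10.3.0"


def apply_gfx_override(env=os.environ, version=GFX_OVERRIDE):
    """Set HSA_OVERRIDE_GFX_VERSION unless the caller already set a value."""
    env.setdefault("HSA_OVERRIDE_GFX_VERSION", version)
    return env["HSA_OVERRIDE_GFX_VERSION"]


def add_on_gpu():
    """Apply the override, then import TensorFlow and run the failing op."""
    apply_gfx_override()
    # Imported late on purpose, so the variable is already in the
    # environment when the ROCm runtime initializes.
    import tensorflow as tf
    return tf.add(1, 2)
```

Calling `add_on_gpu()` in a fresh interpreter reproduces the original test with the override in place. An existing value of the variable is left untouched, so a shell-level `export` still takes precedence.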
Issue Type
Bug
Source
binary
Tensorflow Version
2.9.1
Custom Code
No
OS Platform and Distribution
Linux, Pop!_OS 22.04
Mobile device
No response
Python version
3.10
Bazel version
No response
GCC/Compiler version
No response
CUDA/cuDNN version
No response
GPU model and memory
RX 6700 XT, 12 GB