Open jarvis-hal opened 5 years ago
Would it be possible to get some feedback on whether this is something being worked on/considered?
Just wanted to bump this one more time to see if this is being considered.
I have one V1000 in my cube, and I do want to see embedded APUs put onto the ROCm roadmap, but unfortunately there is no ETA for this. Note that this is not really an HCC issue but a Linux driver-level issue, so the best place to track it is https://github.com/RadeonOpenCompute/ROCm/issues/435.
Thanks, I was looking at that thread too. I ended up here because of a comment by kentrussell:
That's something you should bring up in the HCC github Bug Reports (https://github.com/RadeonOpenCompute/hcc). ROCK/ROCT/ROCR support APUs, but HCC doesn't. If you want to get TensorFlow support for APUs, I'd bring it to their attention directly. The base has support (kernel/thunk/runtime), but HCC obviously needs to support it to get in there
And the person who made this thread was participating in that one too.
@EvilPictureBook thanks. I didn't know the kernel side of things may have been cleared up. Not so sure about the CRAT table in the BIOS for V1000, though.
Say one is able to run /opt/rocm/bin/rocminfo on V1000, which validates that ROCK/ROCT/ROCR truly works on the system; then enabling codegen for it is a relatively easy task in HCC:
Clang driver: https://github.com/RadeonOpenCompute/hcc-clang-upgrade/blob/clang_tot_upgrade/lib/Driver/ToolChains/Hcc.cpp#L175
Linking with ROCm-Device-Libs and invoking the LLVM backend: https://github.com/RadeonOpenCompute/hcc/blob/clang_tot_upgrade/lib/clamp-device.in#L151
Those 2 places need to be changed to ensure the proper GCN ISA version is passed to match the ISA version of V1000, which can be retrieved through /opt/rocm/bin/rocminfo.
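As a quick illustration of that last step, the gfx ISA target can be pulled out of rocminfo's agent listing with a short script. This is only a sketch: the exact rocminfo output format varies between ROCm releases, and the agent names below (a V1605B CPU agent and a gfx902 GPU agent) are hypothetical sample data, not output captured from a real V1000 board.

```python
import re

def parse_gfx_targets(rocminfo_output: str):
    """Extract GCN ISA target names (e.g. gfx902) from rocminfo output.

    GPU agents report their ISA as a gfx<version> string; CPU agents
    report the CPU model name and therefore never match the pattern.
    """
    return re.findall(r"\bgfx\d+\b", rocminfo_output)

# Hypothetical excerpt of rocminfo output; on a real system you would
# capture the output of /opt/rocm/bin/rocminfo instead.
sample = """\
Agent 1
  Name:                    AMD Ryzen Embedded V1605B
  Device Type:             CPU
Agent 2
  Name:                    gfx902
  Device Type:             GPU
"""

print(parse_gfx_targets(sample))  # → ['gfx902']
```

Whatever gfx string comes back is the value that would need to be wired into the two HCC locations above.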
Unfortunately I haven't been able to run /opt/rocm/bin/rocminfo on my system yet, which leads me to wonder whether it is a lower-level (ROCK/ROCT/ROCR / BIOS) issue.
It is possible to get ROCm working on APUs. You can read about the effort here, https://bruhnspace.com/en/bruhnspace-rocm-for-amd-apus/, and download deb files from https://bruhnspace.com/rocm-apu/.
ROCm 2.6 is built with APU support (gfx801: Carrizo and Bristol Ridge; gfx902: Raven Ridge). ROCm 2.7 is a mess... will build 2.8/2.9 if the ongoing renaming chaos has settled down.
I tried their (https://bruhnspace.com) .deb files in a Docker image of Ubuntu 18.04 (under Debian "Bullseye" with kfd working). The hardware was a Ryzen 5 PRO 3350G. I got to the point where TF and PyTorch were reporting CUDA support on the APU, but any execution of even the simplest op on the GPU failed. For this reason, I'm waiting for the prices of old Tesla M40s to drop a bit.
Currently, AMD Ryzen Embedded V1000 APUs are not supported by HCC, which prevents TensorFlow from taking advantage of the APU compute power of the AMD Ryzen Embedded V1000 series. This could be hindering many low-power AI applications that the V1000 could enable. Please help to add support.