ROCm / hcc

HCC is an Open Source, Optimizing C++ Compiler for Heterogeneous Compute currently for the ROCm GPU Computing Platform
https://github.com/RadeonOpenCompute/hcc/wiki

ROCm and HCC - Compilation issues and level of support for Tonga GPU? #810

Open DGASUK opened 6 years ago

DGASUK commented 6 years ago

I was impressed with the aspirations of AMD's open platform for GPGPU computing and recently installed ROCm and HCC but have had some problems with the installation and in compiling the HCC example applications.

Hardware: Rocminfo correctly identifies the Intel Core i7-6700 CPU as Agent 1 and the GPU (gfx802) as Agent 2, with ISA 1 named amdgcn-amd-amdhsa--gfx802. Clinfo also shows correct information, identifying the AMD APP platform and a GPU-based ISA for Tonga PRO GL [FirePro W7100], though it reports OpenCL as v1.2 when the card actually seems to support up to v2.1. The BIOS is set to use CPU-based displays and I have monitors on the DVI and HDMI outputs, with nothing connected to any of the graphics card display ports.

Software: Basic Ubuntu 16.04 with few additions - no other video software or drivers installed, only those that came with ROCm.

ROCm installation: Followed advice and ran sudo apt update, sudo apt dist-upgrade, sudo apt install libnuma-dev, then rebooted before getting rocm.gpg.key, adding rocm repository to /etc/apt/sources.list.d and installing rocm-dkms. Seemed to complete successfully and HelloWorld compiled and ran OK, after adding my username to video group.

HCC installation and build: cloned the hcc.git repository using git clone --recursive -b clang_tot_upgrade into tools folder and built from source using mkdir -p build; cd build, then cmake -DCMAKE_BUILD_TYPE=Release .., then make and make install. May have been some warnings or errors - I wasn't watching the screen all the time as it was quite a lengthy process but it at least completed without any error indication.

HCC-Example-Applications installation: Added /opt/rocm/bin to the path, downloaded the example files, created a build sub-directory and ran CXX=hcc cmake .. from there, then make. This produced a large number of warnings and errors and stopped with 'Makefile:83: recipe for target 'all' failed'. I then tried to compile each example individually using hcc $(hcc-config --cxxflags --ldflags) example_name.cpp

HCC Examples common compilation issue: For all example applications, got the same output: clang-7: warning: -amdgpu-target argument 'gfx802' is not recognized; using gfx803 instead [-Winvalid-command-line-argument] and then several lines of: 'auto' is not a recognized processor for this target (ignoring processor). This is my main concern right now, that the platform/compiler does not actually support Tonga GPUs and is defaulting to a Fiji device.

HCC Examples other compilation issues: For MD, errors included: a) at line 31:5 in MD.cpp, 'variable length array declaration not allowed at file scope' for int dummy[DUMMY_NUM]; b) at line 173:48 in hc_short_vector.inl, no type named 'type' in 'hc::short_vector::vector<float, 4>' for typedef typename vector::type type; (with a further note pointing to MD.cpp line 474:15, at the instantiation of template class MD mdf, and similarly to line 540:20, at the instantiation of template class MD mdd).

For FFT there were various (presumably minor) macro redefinition warnings on constants like M_E but also too many errors to complete compilation, mostly on forward definitions of function calls, probably linked to short_vector/instantiation of template class issue mentioned above in connection with MD.

For BitonicSort there were several warnings that am_copy is deprecated and that accelerator_view::copy should be used instead. The program did compile, but linking failed with an undefined reference to symbol 'hsa_memory_free@@ROCR_1' and 'libhsa-runtime64.so.1: error adding symbols: DSO missing from command line'.

HCC Examples possible runtime issues: Examples SPMV, SyncVsAsyncArrayCopy and ArrayBandwidth (apart from the common clang-7 target warning mentioned above) did all compile, link and generate an executable, which all ran but gave the indication ### HCC STATUS_CHECK Error: HSA_STATUS_ERROR (0x1000) at file:mcwamp_hsa.cpp line:3655 Aborted (core dumped).

P.S. Have done a bit more digging and perhaps found a clue to the GPU issue in HCC2. I installed this and ran the cloc/vector_copy example. This produced the same compiler warnings about unrecognized target and ###HCC STATUS_CHECK error when it ran.

The makefile uses the cloc.sh script, which in turn calls mygpu to determine which gfx processor to use; if it cannot find one, it defaults to Fiji (gfx803). Mygpu in turn looks for a kfdid in one of the /sys/devices/virtual/kfd/kfd/topology/nodes/*/properties files. There are 26 items of information in the Node 1 properties file, one of which gives the same device_id number as ROCminfo shows for the Agent 1 Chip ID, but there is no kfdid parameter.
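The lookup described above can be sketched roughly as follows. This is only an illustration in Python, not the actual mygpu script: the function names parse_properties and find_gpu_device_ids are made up for this example, and it assumes that each properties file holds one whitespace-separated "name value" pair per line and that CPU-only nodes report a device_id of 0.

```python
import glob
import os

# Location of the KFD topology nodes, as quoted in the post above.
KFD_NODES = "/sys/devices/virtual/kfd/kfd/topology/nodes"


def parse_properties(text):
    """Parse 'name value' lines into a dict (assumed file layout)."""
    props = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            props[parts[0]] = parts[1]
    return props


def find_gpu_device_ids(nodes_dir=KFD_NODES):
    """Return {node_name: device_id} for nodes with a non-zero device_id."""
    found = {}
    pattern = os.path.join(nodes_dir, "*", "properties")
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as handle:
                props = parse_properties(handle.read())
            device_id = int(props.get("device_id", "0"))
        except (OSError, ValueError):
            continue  # unreadable or malformed node: skip it
        if device_id != 0:
            node = os.path.basename(os.path.dirname(path))
            found[node] = device_id
    return found


if __name__ == "__main__":
    # Prints nothing on a machine without the KFD topology directory.
    for node, device_id in find_gpu_device_ids().items():
        print(f"node {node}: device_id {device_id} (0x{device_id:04x})")
```

A lookup keyed on device_id rather than kfdid would at least pick up the value that is present in the Node 1 file, though whether mygpu could then map it to a gfx name is a separate question.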

I read that the ROCm HSA driver replaces the amdkfd driver but ROCm perhaps still relies on information provided by the previous driver.

So if this is indeed the issue, the question is can I put a valid kfdid value into the properties file manually (with a corresponding addition to the kfkid2code() function in mygpu as currently there are only entries for Vega and Fiji) without upsetting anything, or do I need to uninstall everything, install kfd driver then reinstall ROCm?

Any suggestions/comments, please?

vsytch commented 6 years ago

If you want to compile for gfx802 try adding it here https://github.com/RadeonOpenCompute/hcc-clang-upgrade/blob/be6eeeeffb62557a79a6b559b7b633fb29e5d8e4/lib/Driver/ToolChains/Hcc.cpp#L311, though it does not seem to be supported.

JMadgwick commented 6 years ago

As this is still open, I think it makes sense to add that in the other issue it was confirmed that ROCm supports Tonga, but not in all aspects. HCC does NOT support Tonga. It supports only gfx701, gfx803, gfx900 and gfx906. However, there is a fix waiting to be merged in https://github.com/RadeonOpenCompute/hcc-clang-upgrade/pull/149 which looks like it will add support for Tonga and other gfx targets which are not currently supported.
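To make the behaviour reported in this thread concrete, here is a toy model in Python. It is not HCC's driver code: the name resolve_target is hypothetical, the supported list is simply the one given above, and the gfx803 fallback is taken from the clang-7 warning quoted in the original post.

```python
# Targets listed as supported in the comment above; not read from HCC itself.
SUPPORTED_TARGETS = ("gfx701", "gfx803", "gfx900", "gfx906")

# Fallback named in the clang-7 warning quoted earlier in the thread.
FALLBACK_TARGET = "gfx803"


def resolve_target(requested):
    """Return (target, substituted) for a requested gfx name."""
    if requested in SUPPORTED_TARGETS:
        return requested, False
    # Unsupported name: model the substitution the warning describes.
    return FALLBACK_TARGET, True


if __name__ == "__main__":
    for name in ("gfx802", "gfx900"):
        target, substituted = resolve_target(name)
        note = " (substituted)" if substituted else ""
        print(f"{name} -> {target}{note}")
```

Under this model a gfx802 request silently becomes a gfx803 build, which would be consistent with the examples compiling but then failing at run time on a Tonga card.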

DGASUK commented 6 years ago

Thanks for getting back with something definitive on this. It is disappointing, as I bought the FirePro card a few years ago specifically for GPGPU processing. It was one of the few at the time flagged as pending support for OpenCL 2.0, though the driver did take some time to become available.
