etaler / Etaler

A flexible HTM (Hierarchical Temporal Memory) framework with full GPU support.
BSD 3-Clause "New" or "Revised" License

AMD GPU support over Mesa + Clover #157

Closed: marty1885 closed this issue 3 years ago

marty1885 commented 3 years ago

Just opening an issue so people can find the information later. I just got my AMD RX 570 GPU (on my ARM server) working with OpenCL via Clover (aka GalliumCompute). Clover aims to provide OpenCL support on top of the open source Mesa/Gallium GPU framework.

Anyway, Etaler works and passes all tests on Clover. Yay!!

marty1885 commented 3 years ago

Note to myself:

The performance is very bad, but it works nevertheless. It looks like a mix of driver latency, compiler capability, and GPU performance issues.

Benchmarking TemporalMemory algorithm on backend: OpenCL on Radeon RX 570 Series (POLARIS10, DRM 3.35.0, 5.4.3-00006-gf671e064721e, LLVM 10.0.1) 

64 bits per SDR, 1.49374ms per forward
128 bits per SDR, 2.48617ms per forward
512 bits per SDR, 7.16544ms per forward
1024 bits per SDR, 13.2979ms per forward
8192 bits per SDR, 106.193ms per forward

Compared to the CPU (LX2160A, 16-core ARM):

Benchmarking TemporalMemory algorithm on backend: CPU 

64 bits per SDR, 0.563107ms per forward
128 bits per SDR, 1.21684ms per forward
512 bits per SDR, 2.91978ms per forward
1024 bits per SDR, 6.09122ms per forward
8192 bits per SDR, 45.3838ms per forward
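
To put the two runs side by side, here is a quick Python sketch that divides the OpenCL timing by the CPU timing at each SDR size. It is not part of Etaler; the dictionary names are made up for illustration, and the numbers are copied straight from the two benchmark outputs above.

```python
# Timings copied from the two benchmark runs above (ms per forward).
# The dict names are illustrative only; nothing here is Etaler API.
gpu_ms = {64: 1.49374, 128: 2.48617, 512: 7.16544, 1024: 13.2979, 8192: 106.193}
cpu_ms = {64: 0.563107, 128: 1.21684, 512: 2.91978, 1024: 6.09122, 8192: 45.3838}

# How many times slower the OpenCL run is than the CPU run at each SDR size.
slowdown = {bits: gpu_ms[bits] / cpu_ms[bits] for bits in gpu_ms}

for bits, ratio in slowdown.items():
    print(f"{bits} bits per SDR: GPU/CPU = {ratio:.2f}x")
```

On these figures the OpenCL run is roughly 2x to 2.7x slower than the CPU at every size tested.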

To be fair, this is the first time Etaler has run on an AMD GPU (and I want to be as vendor neutral as possible); I never optimized the code for it. The performance drop from 1024 to 8192 bits suggests that AMD GPUs rely heavily on good local memory access to gain performance. That's going to be an issue down the line.
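
For reference, the growth in time from 1024 to 8192 bits can be worked out for both backends from the benchmark output in this thread. The sketch below is plain Python with illustrative names, not Etaler code, and it only compares growth factors.

```python
# ms per forward at 1024 and 8192 bits, copied from the benchmark output
# earlier in this thread. Backend labels are illustrative only.
timings_ms = {
    "OpenCL (RX 570)": (13.2979, 106.193),
    "CPU (LX2160A)": (6.09122, 45.3838),
}

size_growth = 8192 / 1024  # the SDR size grows 8x between the two rows

growth = {}
for backend, (t_1024, t_8192) in timings_ms.items():
    # Ratio of the 8192-bit timing to the 1024-bit timing for this backend.
    growth[backend] = t_8192 / t_1024
    print(f"{backend}: time grows {growth[backend]:.2f}x for {size_growth:.0f}x more bits")
```

This says nothing about why either backend scales the way it does, so the local memory explanation stays a hypothesis until the kernels are actually profiled.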