krrishnarraj / clpeak

A tool which profiles OpenCL devices to find their peak capacities
Apache License 2.0

clpeak does not work with POCL #47

Closed znmeb closed 5 years ago

znmeb commented 6 years ago

Platform: Portable Computing Language
  Device: pthread-AMD FX(tm)-8350 Eight-Core Processor
    Driver version  : 1.1-pre (Linux x64)
    Compute units   : 8
    Clock frequency : 4000 MHz

    Global memory bandwidth (GBPS)
      float   : clEnqueueNDRangeKernel (-63)
      Tests skipped

    Single-precision compute (GFLOPS)
clCreateBuffer (-61)
      Tests skipped

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
clCreateBuffer (-61)
      Tests skipped

    Integer compute (GIOPS)
clCreateBuffer (-61)
      Tests skipped

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer         : 0.00
      enqueueReadBuffer          : 0.00
      enqueueMapBuffer(for read) : 0.00
        memcpy from mapped ptr   : inf
      enqueueUnmap(after write)  : 0.00
        memcpy to mapped ptr     : inf

    Kernel launch latency : 

clpeak stops responding after printing "Kernel launch latency :" and never reports a value.

See https://github.com/pocl/pocl/issues/522; they seem to think it's a clpeak issue.
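For anyone reading the log, the bare numbers in parentheses are OpenCL status codes. In the Khronos `CL/cl.h` header, -61 is `CL_INVALID_BUFFER_SIZE` and -63 is `CL_INVALID_GLOBAL_WORK_SIZE`. The OpenCL specification says `clCreateBuffer` returns `CL_INVALID_BUFFER_SIZE` when the requested size is 0 or larger than the device's `CL_DEVICE_MAX_MEM_ALLOC_SIZE`. That narrows things down but does not by itself say whether clpeak or POCL is at fault. Below is a minimal sketch for decoding such lines; the helper name `decode_log_line` and the regular expression are my own assumptions and not part of clpeak or POCL, and the table is only a small subset of the codes.

```python
import re

# Subset of OpenCL status codes, with the values defined in Khronos' CL/cl.h.
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -30: "CL_INVALID_VALUE",
    -54: "CL_INVALID_WORK_GROUP_SIZE",
    -61: "CL_INVALID_BUFFER_SIZE",
    -63: "CL_INVALID_GLOBAL_WORK_SIZE",
}

# Hypothetical pattern for lines such as "clCreateBuffer (-61)".
_ERROR_LINE = re.compile(r"\b(cl[A-Za-z]+)\s*\((-?\d+)\)")


def decode_log_line(line):
    """Return (api_call, code, symbolic_name), or None if the line has no error."""
    match = _ERROR_LINE.search(line)
    if match is None:
        return None
    call, code = match.group(1), int(match.group(2))
    return call, code, CL_ERROR_NAMES.get(code, "unknown (not in this subset)")


if __name__ == "__main__":
    sample = [
        "      float   : clEnqueueNDRangeKernel (-63)",
        "clCreateBuffer (-61)",
        "      Tests skipped",
    ]
    for line in sample:
        decoded = decode_log_line(line)
        if decoded is not None:
            print("%s failed with %d (%s)" % decoded)
```

Comparing the failing sizes against what `clinfo` reports for the device's maximum allocation size and work-group limits would be a reasonable next step.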

dbeurle commented 5 years ago

I can confirm similar behavior with my Ryzen 7 2700, except that the kernel launch latency test completes:

Platform: Portable Computing Language
  Device: pthread-AMD Ryzen 7 2700 Eight-Core Processor
    Driver version  : 1.2 (Linux x64)
    Compute units   : 16
    Clock frequency : 3200 MHz

    Global memory bandwidth (GBPS)
      float   : clEnqueueNDRangeKernel (-63)
      Tests skipped

    Single-precision compute (GFLOPS)
clCreateBuffer (-61)
      Tests skipped

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
clCreateBuffer (-61)
      Tests skipped

    Integer compute (GIOPS)
clCreateBuffer (-61)
      Tests skipped

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer         : 0.00
      enqueueReadBuffer          : 0.00
      enqueueMapBuffer(for read) : 0.00
        memcpy from mapped ptr   : inf
      enqueueUnmap(after write)  : 0.00
        memcpy to mapped ptr     : inf

    Kernel launch latency : 16.85 us