krrishnarraj / clpeak

A tool which profiles OpenCL devices to find their peak capacities
Apache License 2.0

Results for AMD Ryzen 9 7950X integrated graphics (Raphael) on Windows 11 #107

Status: Open. Opened by moyang 1 year ago.

moyang commented 1 year ago
Platform: AMD Accelerated Parallel Processing
  Device: gfx1036
    Driver version  : 3516.0 (PAL,LC) (Win64)
    Compute units   : 1
    Clock frequency : 2200 MHz

    Global memory bandwidth (GBPS)
      float   : 51.73
      float2  : 58.41
      float4  : 59.40
      float8  : 54.55
      float16 : 55.38

    Single-precision compute (GFLOPS)
      float   : 563.63
      float2  : 546.60
      float4  : 545.07
      float8  : 541.92
      float16 : 535.95

    Half-precision compute (GFLOPS)
      half   : 563.71
      half2  : 1112.40
      half4  : 1105.64
      half8  : 1078.05
      half16 : 1068.91

    Double-precision compute (GFLOPS)
      double   : 35.49
      double2  : 35.44
      double4  : 35.37
      double8  : 35.21
      double16 : 34.93

    Integer compute (GIOPS)
      int   : 113.37
      int2  : 113.29
      int4  : 113.10
      int8  : 112.77
      int16 : 112.20

    Integer compute Fast 24bit (GIOPS)
      int   : 557.82
      int2  : 556.65
      int4  : 553.41
      int8  : 546.53
      int16 : 494.13

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 26.64
      enqueueReadBuffer               : 27.86
      enqueueWriteBuffer non-blocking : 26.07
      enqueueReadBuffer non-blocking  : 27.66
      enqueueMapBuffer(for read)      : 1022611.31
        memcpy from mapped ptr        : 27.79
      enqueueUnmap(after write)       : 6135667.50
        memcpy to mapped ptr          : 27.25

    Kernel launch latency : 38.55 us
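
As a rough sanity check on the compute figures above, the measured values can be compared against a back-of-the-envelope theoretical peak. The sketch below is not part of clpeak and rests on assumptions the log does not state: that the single reported compute unit exposes 128 FP32 lanes (one RDNA2 work-group processor), that each lane retires one fused multiply-add (2 FLOPs) per clock, that packed half runs at twice the FP32 rate, and that double runs at 1/16 of it. The names `peak_gflops` and `efficiency` are made up for illustration.

```python
# Rough sanity check of the clpeak compute figures in this issue.
# ASSUMPTIONS (not stated in the log): one reported compute unit = 128 FP32
# lanes, one fused multiply-add (2 FLOPs) per lane per clock.
LANES = 128          # assumed FP32 lanes behind the single reported compute unit
CLOCK_GHZ = 2.2      # "Clock frequency : 2200 MHz" from the log
FLOPS_PER_FMA = 2    # a fused multiply-add counts as two floating-point ops


def peak_gflops(rate=1.0, lanes=LANES, clock_ghz=CLOCK_GHZ):
    """Theoretical peak in GFLOPS; `rate` is throughput relative to FP32."""
    return lanes * FLOPS_PER_FMA * clock_ghz * rate


# (measured GFLOPS copied from the log, assumed rate relative to FP32)
MEASURED = {
    "float": (563.63, 1.0),
    "half2": (1112.40, 2.0),      # assumed: packed FP16 at 2x the FP32 rate
    "double": (35.49, 1.0 / 16),  # assumed: FP64 at 1/16 the FP32 rate
}


def efficiency(name):
    """Measured throughput divided by the assumed theoretical peak."""
    measured, rate = MEASURED[name]
    return measured / peak_gflops(rate)


if __name__ == "__main__":
    for name in MEASURED:
        print(f"{name:7s} peak ~{peak_gflops(MEASURED[name][1]):8.1f} GFLOPS, "
              f"measured/peak = {efficiency(name):.3f}")
```

If those assumptions hold, the ratios land close to 1.0, which suggests the benchmark is reaching the hardware's arithmetic peak. A ratio slightly above 1.0 would most likely mean the GPU clocked a little above the reported 2200 MHz during the run.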