geerlingguy / top500-benchmark

Automated Top500 benchmark for clusters or single nodes.
MIT License

Benchmark Radxa X4 (Intel N100) #38

Closed by geerlingguy 1 month ago

geerlingguy commented 1 month ago

As the title says...

geerlingguy commented 1 month ago

With defaults in Ubuntu 24.04 using the official Heatsink/Fan/Case (which doesn't seem able to prevent throttling in my current configuration; see TODO), I am getting:

37.224 Gflops at 16W (which settles in at 75°C temps, around 2 GHz), for 2.33 Gflops/W
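
The Gflops/W figure is just the HPL result divided by the measured power draw. A minimal Python sketch of that arithmetic, using the numbers quoted above (the function name `gflops_per_watt` is hypothetical and not part of top500-benchmark):

```python
def gflops_per_watt(gflops: float, watts: float) -> float:
    """Return benchmark efficiency in Gflops per watt."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return gflops / watts


# Values taken from the result quoted above: 37.224 Gflops at 16 W.
efficiency = gflops_per_watt(37.224, 16.0)
print(f"{efficiency:.4f} Gflops/W")
```

The wattage here is a wall-power reading, so the result is only as precise as that measurement.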

    HPLinpack 2.3  --  High-Performance Linpack benchmark  --   December 2, 2018
    Written by A. Petitet and R. Clint Whaley,  Innovative Computing Laboratory, UTK
    Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
    Modified by Julien Langou, University of Colorado Denver

    An explanation of the input/output parameters follows:
    T/V    : Wall time / encoded variant.
    N      : The order of the coefficient matrix A.
    NB     : The partitioning blocking factor.
    P      : The number of process rows.
    Q      : The number of process columns.
    Time   : Time in seconds to solve the linear system.
    Gflops : Rate of execution for solving the linear system.

    The following parameter values will be used:

    N      :   23314
    NB     :     256
    PMAP   : Row-major process mapping
    P      :       1
    Q      :       4
    PFACT  :   Right
    NBMIN  :       4
    NDIV   :       2
    RFACT  :   Crout
    BCAST  :  1ringM
    DEPTH  :       1
    SWAP   : Mix (threshold = 64)
    L1     : transposed form
    U      : transposed form
    EQUIL  : yes
    ALIGN  : 8 double precision words


    - The matrix A is randomly generated for each test.
    - The following scaled residual check will be computed:
          ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
    - The relative machine precision (eps) is taken to be               1.110223e-16
    - Computational tests pass if scaled residuals are less than                16.0

    T/V                N    NB     P     Q               Time                 Gflops
    WR11C2R4       23314   256     1     4             226.97             3.7224e+01
    HPL_pdgesv() start time Tue Jul 30 17:08:25 2024

    HPL_pdgesv() end time   Tue Jul 30 17:12:12 2024

    ||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=   3.29349238e-03 ...... PASSED

    Finished      1 tests with the following results:
                  1 tests completed and passed residual checks,
                  0 tests completed and failed residual checks,
                  0 tests skipped because of illegal input values.

    End of Tests.

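
As a sanity check, the reported Gflops can be reproduced from the N and Time columns of the result line. The sketch below assumes HPL's usual operation count of 2/3·N³ + 3/2·N² floating-point operations for the factorization and solve; `hpl_gflops` is an illustrative name, not something provided by HPL or this repository. Because the printed Time is rounded to two decimals, agreement to three or four significant figures is about the best to expect.

```python
def hpl_gflops(n: int, seconds: float) -> float:
    """Estimate the HPL Gflops rate from problem size N and wall time."""
    if n <= 0 or seconds <= 0:
        raise ValueError("n and seconds must be positive")
    # Assumed operation count: LU factorization plus triangular solves.
    flops = (2.0 / 3.0) * n**3 + 1.5 * n**2
    return flops / seconds / 1e9


# N and Time taken from the result line in the log above.
estimate = hpl_gflops(23314, 226.97)
print(f"{estimate:.2f} Gflops")
```

If the estimate and the logged value disagree by more than rounding, the N or Time value was probably copied from a different run.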
geerlingguy commented 1 month ago

Closing for now. If I can find a better way to cool the chip so it doesn't throttle so quickly, I'll re-run the benchmarks.

geerlingguy commented 1 month ago

Another test using Noctua thermal paste instead of a thermal pad: 37.531 Gflops at 15W, for 2.50 Gflops/W

    HPLinpack 2.3  --  High-Performance Linpack benchmark  --   December 2, 2018
    Written by A. Petitet and R. Clint Whaley,  Innovative Computing Laboratory, UTK
    Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
    Modified by Julien Langou, University of Colorado Denver

    An explanation of the input/output parameters follows:
    T/V    : Wall time / encoded variant.
    N      : The order of the coefficient matrix A.
    NB     : The partitioning blocking factor.
    P      : The number of process rows.
    Q      : The number of process columns.
    Time   : Time in seconds to solve the linear system.
    Gflops : Rate of execution for solving the linear system.

    The following parameter values will be used:

    N      :   23314
    NB     :     256
    PMAP   : Row-major process mapping
    P      :       1
    Q      :       4
    PFACT  :   Right
    NBMIN  :       4
    NDIV   :       2
    RFACT  :   Crout
    BCAST  :  1ringM
    DEPTH  :       1
    SWAP   : Mix (threshold = 64)
    L1     : transposed form
    U      : transposed form
    EQUIL  : yes
    ALIGN  : 8 double precision words


    - The matrix A is randomly generated for each test.
    - The following scaled residual check will be computed:
          ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
    - The relative machine precision (eps) is taken to be               1.110223e-16
    - Computational tests pass if scaled residuals are less than                16.0

    T/V                N    NB     P     Q               Time                 Gflops
    WR11C2R4       23314   256     1     4             225.12             3.7531e+01
    HPL_pdgesv() start time Wed Jul 31 11:34:43 2024

    HPL_pdgesv() end time   Wed Jul 31 11:38:28 2024

    ||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=   3.29349238e-03 ...... PASSED

    Finished      1 tests with the following results:
                  1 tests completed and passed residual checks,
                  0 tests completed and failed residual checks,
                  0 tests skipped because of illegal input values.

    End of Tests.
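
To compare the two runs without retyping numbers, the `WR...` result lines can be parsed directly. This is a rough sketch that assumes the seven-column layout shown in the logs above (T/V, N, NB, P, Q, Time, Gflops); `HplResult` and `parse_result_line` are hypothetical helper names, not part of HPL or top500-benchmark.

```python
from typing import NamedTuple


class HplResult(NamedTuple):
    variant: str
    n: int
    nb: int
    p: int
    q: int
    seconds: float
    gflops: float


def parse_result_line(line: str) -> HplResult:
    """Parse one HPL result line such as 'WR11C2R4 23314 256 1 4 226.97 3.7224e+01'."""
    fields = line.split()
    if len(fields) != 7 or not fields[0].startswith("W"):
        raise ValueError(f"not an HPL result line: {line!r}")
    return HplResult(
        fields[0],
        int(fields[1]),
        int(fields[2]),
        int(fields[3]),
        int(fields[4]),
        float(fields[5]),
        float(fields[6]),
    )


# Result lines copied from the two logs in this thread.
pad = parse_result_line(
    "WR11C2R4       23314   256     1     4             226.97             3.7224e+01"
)
paste = parse_result_line(
    "WR11C2R4       23314   256     1     4             225.12             3.7531e+01"
)

# Relative change in Gflops when moving from the thermal pad to thermal paste.
gain_percent = (paste.gflops / pad.gflops - 1.0) * 100.0
print(f"Gflops change: {gain_percent:.2f}%")
```

The raw Gflops change between the two runs is under 1%, so most of the move from 2.33 to 2.50 Gflops/W comes from the lower power reading (15W vs. 16W) rather than from extra throughput.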