geerlingguy / top500-benchmark

Automated Top500 benchmark for clusters or single nodes.
MIT License

Benchmark Milk-V Jupiter #37

Closed: geerlingguy closed this 2 months ago

geerlingguy commented 2 months ago

As the title says...

Similar to https://github.com/geerlingguy/top500-benchmark/issues/35, I will probably run Ansible from a separate computer.

geerlingguy commented 2 months ago

For some reason, trying Ansible from my Mac, I'm getting:

    debug3: sign_and_send_pubkey: signing using ssh-ed25519 [...]
    debug3: send packet: type 50
    Connection closed by 10.0.2.212 port 22

And I can't get it to auth with password or key, on either my jgeerling account or root. I can log in via SSH with both key and password through my normal CLI login, so it seems like something server-side is blocking the Ansible connection, maybe?

The auth.log even seems to indicate SSH is happy on the server side, the connection just... drops:

    2024-07-20T05:13:18.194612+08:00 milkv-jupiter sshd[7779]: Accepted publickey for root from 10.0.2.15 port 61453 ssh2: ED25519 [...]
    2024-07-20T05:13:35.366477+08:00 milkv-jupiter sshd[7785]: Accepted password for jgeerling from 10.0.2.15 port 61454 ssh2

geerlingguy commented 2 months ago

Using my guide for Installing Ansible on a RISC-V computer, I got it up and running. I will have to run this playbook locally, I guess.

geerlingguy commented 2 months ago

Power consumption during the benchmark run is around 10.7W:

[Screenshot 2024-07-20 at 11:16:49 PM]

Note: I may re-run everything with the same USB-C power supply I use in my Pi 5 / Arm SBC testing, as it should put out 27W at 12V... I think. I like having that consistency if I can manage it!

geerlingguy commented 2 months ago

Run with Ps 1 and Qs 8 (a 1x8 process grid): 5.6009 Gflops at 10.7W, for 0.52 Gflops/W

================================================================================
HPLinpack 2.3  --  High-Performance Linpack benchmark  --   December 2, 2018
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      :   34581
NB     :     256
PMAP   : Row-major process mapping
P      :       1
Q      :       8
PFACT  :   Right
NBMIN  :       4
NDIV   :       2
RFACT  :   Crout
BCAST  :  1ringM
DEPTH  :       1
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be               1.110223e-16
- Computational tests pass if scaled residuals are less than                16.0

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR11C2R4       34581   256     1     8            4922.54             5.6009e+00
HPL_pdgesv() start time Sat Jul 20 22:50:43 2024

HPL_pdgesv() end time   Sun Jul 21 00:12:46 2024

--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=   2.26617141e-03 ...... PASSED
================================================================================

Finished      1 tests with the following results:
              1 tests completed and passed residual checks,
              0 tests completed and failed residual checks,
              0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
================================================================================
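
As a sanity check on the result line above, the Gflops figure can be reproduced from `N` and `Time`. This is a minimal sketch that assumes HPL counts (2/3)·N³ + (3/2)·N² floating-point operations for the solve; the function name `hpl_gflops` is made up for illustration and is not part of HPL or this repository.

```python
# Sketch: reproduce the Gflops column from N and Time.
# Assumption: the operation count is (2/3)*N^3 + (3/2)*N^2.


def hpl_gflops(n: int, seconds: float) -> float:
    """Return the Gflops rate implied by problem size n and solve time in seconds."""
    flops = (2.0 / 3.0) * n**3 + 1.5 * n**2
    return flops / seconds / 1e9


# Values taken from the result line of the 1x8 run above.
rate = hpl_gflops(34581, 4922.54)
print(f"{rate:.4f} Gflops")
```

With N = 34581 and Time = 4922.54 s, this lands on the reported 5.6009 Gflops to four decimal places.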

geerlingguy commented 2 months ago

7.2W, 3.0838 Gflops, for 0.43 Gflops/W (so a bit less efficient, as would be expected), running with 1/4 Ps/Qs:

================================================================================
HPLinpack 2.3  --  High-Performance Linpack benchmark  --   December 2, 2018
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      :   34581
NB     :     256
PMAP   : Row-major process mapping
P      :       1
Q      :       4
PFACT  :   Right
NBMIN  :       4
NDIV   :       2
RFACT  :   Crout
BCAST  :  1ringM
DEPTH  :       1
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be               1.110223e-16
- Computational tests pass if scaled residuals are less than                16.0

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR11C2R4       34581   256     1     4            8940.58             3.0838e+00
HPL_pdgesv() start time Sun Jul 21 09:39:43 2024

HPL_pdgesv() end time   Sun Jul 21 12:08:44 2024

--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=   3.03925351e-03 ...... PASSED
================================================================================

Finished      1 tests with the following results:
              1 tests completed and passed residual checks,
              0 tests completed and failed residual checks,
              0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
================================================================================
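
The efficiency comparison between the 1x8 and 1x4 runs is plain division, sketched below using only the Gflops and wattage figures quoted in this thread. The dictionary layout and variable names are illustrative assumptions, not anything the playbook produces.

```python
# Sketch: compare the 1x8 and 1x4 runs using the figures quoted in the thread.
runs = {
    "1x8": {"gflops": 5.6009, "watts": 10.7},
    "1x4": {"gflops": 3.0838, "watts": 7.2},
}

for name, run in runs.items():
    # Efficiency is simply sustained Gflops divided by measured power draw.
    run["gflops_per_watt"] = run["gflops"] / run["watts"]
    print(f"{name}: {run['gflops_per_watt']:.2f} Gflops/W")

# How much faster, and how much more power-hungry, is 8 processes vs 4?
speedup = runs["1x8"]["gflops"] / runs["1x4"]["gflops"]
power_ratio = runs["1x8"]["watts"] / runs["1x4"]["watts"]
print(f"speedup: {speedup:.2f}x, power: {power_ratio:.2f}x")
```

Going from 4 to 8 processes raises throughput by roughly 1.8x while power rises by roughly 1.5x, which is why the 4-process run comes out lower in Gflops/W.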

geerlingguy commented 2 months ago

And finally, 2/4 Ps/Qs: 5.6560 Gflops, 10.6W, for 0.53 Gflops/W

================================================================================
HPLinpack 2.3  --  High-Performance Linpack benchmark  --   December 2, 2018
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      :   34581
NB     :     256
PMAP   : Row-major process mapping
P      :       2
Q      :       4
PFACT  :   Right
NBMIN  :       4
NDIV   :       2
RFACT  :   Crout
BCAST  :  1ringM
DEPTH  :       1
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be               1.110223e-16
- Computational tests pass if scaled residuals are less than                16.0

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR11C2R4       34581   256     2     4            4874.58             5.6560e+00
HPL_pdgesv() start time Sun Jul 21 14:48:40 2024

HPL_pdgesv() end time   Sun Jul 21 16:09:54 2024

--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=   3.03925351e-03 ...... PASSED
================================================================================

Finished      1 tests with the following results:
              1 tests completed and passed residual checks,
              0 tests completed and failed residual checks,
              0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
================================================================================
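
For anyone collecting several runs like the three in this thread, the result line can be pulled out of the HPL output with a regular expression. The sketch below is an assumption-laden helper, not part of the top500-benchmark playbook: `parse_result` and `MEASURED_WATTS` are hypothetical names, and the wattage values are the manual readings quoted in the comments above.

```python
import re

# Matches an HPL result line such as:
# WR11C2R4       34581   256     2     4            4874.58             5.6560e+00
RESULT_RE = re.compile(
    r"^(?P<tv>W\S+)\s+(?P<n>\d+)\s+(?P<nb>\d+)\s+(?P<p>\d+)\s+(?P<q>\d+)"
    r"\s+(?P<time>[\d.]+)\s+(?P<gflops>[\d.eE+-]+)\s*$"
)

# Power readings quoted in this thread, keyed by (P, Q). Measured by hand, not by HPL.
MEASURED_WATTS = {(1, 8): 10.7, (1, 4): 7.2, (2, 4): 10.6}


def parse_result(line: str):
    """Return a dict of fields from an HPL result line, or None if it does not match."""
    match = RESULT_RE.match(line.strip())
    if match is None:
        return None
    return {
        "tv": match["tv"],
        "n": int(match["n"]),
        "nb": int(match["nb"]),
        "p": int(match["p"]),
        "q": int(match["q"]),
        "time": float(match["time"]),
        "gflops": float(match["gflops"]),
    }


lines = [
    "WR11C2R4       34581   256     1     8            4922.54             5.6009e+00",
    "WR11C2R4       34581   256     1     4            8940.58             3.0838e+00",
    "WR11C2R4       34581   256     2     4            4874.58             5.6560e+00",
]
parsed = [parse_result(line) for line in lines]
for result in parsed:
    watts = MEASURED_WATTS[(result["p"], result["q"])]
    print(f"{result['p']}x{result['q']}: {result['gflops'] / watts:.2f} Gflops/W")
```

Lines that are not result lines (headers, separators, timestamps) return `None`, so the same function can be applied to every line of a saved HPL log.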

geerlingguy commented 2 months ago

It has been done!