ThomasKaiser / sbc-bench

Simple benchmark for single board computers
BSD 3-Clause "New" or "Revised" License

Kendryte K510 error #46

Closed archanox closed 2 years ago

archanox commented 2 years ago
/sbc-bench.sh: line 1572: 3 * 221 / 0 : division by 0 (error token is "0 ")
debian@debian:~$ cat /proc/cpuinfo
hart    : 0
isa : rv64i2p0m2p0a2p0f2p0d2p0c2p0xv5-0p0
mmu : sv39

hart    : 1
isa : rv64i2p0m2p0a2p0f2p0d2p0c2p0xv5-0p0
mmu : sv39

debian@debian:~$ uname -a
Linux debian 4.17.0 #1 SMP PREEMPT Wed Jun 22 05:46:18 AEST 2022 riscv64 GNU/Linux
archanox commented 2 years ago
Checking cpufreq OPP..../sbc-bench.sh: line 1804: echo: write error: Invalid argument
./sbc-bench.sh: line 1815: [: inf: integer expression expected
./sbc-bench.sh: line 1819: [: inf: integer expression expected
./sbc-bench.sh: line 1804: echo: write error: Invalid argument
./sbc-bench.sh: line 1815: [: inf: integer expression expected
./sbc-bench.sh: line 1819: [: inf: integer expression expected
 Done (results will be available in 44-62 minutes).
archanox commented 2 years ago
Executing 7-zip benchmark..../sbc-bench.sh: line 1931:  8175 Killed                  taskset -c ${ClusterConfig[$i]} "${SevenZip}" b -mmt=1 >> ${TempLog}
./sbc-bench.sh: line 1931:  8361 Killed                  taskset -c ${ClusterConfig[$i]} "${SevenZip}" b -mmt=1 >> ${TempLog}
./sbc-bench.sh: line 1908:  8517 Killed                  "${SevenZip}" b -mmt=${CPUCores} >> ${TempLog}
./sbc-bench.sh: line 1908:  8599 Killed                  "${SevenZip}" b -mmt=${CPUCores} >> ${TempLog}
./sbc-bench.sh: line 1908:  8650 Killed                  "${SevenZip}" b -mmt=${CPUCores} >> ${TempLog}
./sbc-bench.sh: line 1968: 0 / 0 : division by 0 (error token is "0 ")
ThomasKaiser commented 2 years ago

Can you please run this in debug mode? /bin/bash -x /path/to/sbc-bench.sh and provide output via some pasteboard service?

archanox commented 2 years ago

logfile.txt

ThomasKaiser commented 2 years ago

There are at least two problems:

1) the cpufreq implementation seems to be weird:

                     cpufreq   min    max
     CPU    cluster  policy   speed  speed   core type
      0       -1        0        0     792       -
      1       -1        1        0     792       -

There are two cpufreq policies defined (which IMO makes no sense with this type of CPU), and cpuinfo_min_freq is always 0, which also makes no sense (and triggered the various errors).

2) The board has only 192 MB RAM, which is too little for the 7-zip benchmark. Without installing zram-tools and tuning /etc/default/zramswap, the 7-zip benchmarks will still be oom-killed. And with zram, I would expect too much CPU time to be spent on zram itself.

Anyway: I tried to fix 1) in the most recent commit, so please download again and give it another try. If errors still occur (other than 7-zip being oom-killed), please provide debug output again.
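For illustration only: the `division by 0` and `[: inf: integer expression expected` errors above come from shell arithmetic being fed a zero or non-integer value. A minimal sketch of a guard follows. The helper name `safe_div` and its fallback argument are hypothetical and are not taken from sbc-bench, so this is not the fix that went into the commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of sbc-bench): integer division that prints a
# fallback value instead of aborting when an operand is empty, non-numeric
# (e.g. "inf") or when the divisor is 0.
# Assumes non-negative decimal integers without leading zeros.
safe_div() {
    local numerator="$1" divisor="$2" fallback="${3:-0}" value
    for value in "${numerator}" "${divisor}"; do
        case "${value}" in
            ''|*[!0-9]*) printf '%s\n' "${fallback}"; return 0 ;;
        esac
    done
    if [ "${divisor}" -eq 0 ]; then
        printf '%s\n' "${fallback}"
    else
        printf '%s\n' "$(( ${numerator} / ${divisor} ))"
    fi
}

# The expression from the first error message, with the zero divisor guarded:
safe_div "$(( 3 * 221 ))" 0 -1    # prints -1
# A regular division for comparison:
safe_div 792 2                    # prints 396
```

Printing a sentinel such as -1 keeps the script running so that later steps can decide whether to skip the measurement. Whether that is preferable to aborting depends on the caller.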

archanox commented 2 years ago

Success!

| canaan-creative,k510 | 792 MHz | 4.17 | Debian GNU/Linux bookworm/sid riscv64 | 0 | 6690 | 7400 | 280 | 440 | - | http://ix.io/41NT |

ThomasKaiser commented 2 years ago

> Success!

Well, although the script errors seem to have been addressed, all 7-zip benchmark runs have been oom-killed. So, as already suggested, this won't work without installing zram-tools.

Can you please give these commands a try and report where they stop (with and without zram):

 7zr b -md=2m -mmt=1
 7zr b -md=2m

Or simply try the latest version, which tries to adjust the dictionary size based on available memory.

archanox commented 2 years ago

Without zram

debian@debian:~$ 7zr b -md=2m -mmt=1

7-Zip (a) 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,2 CPUs LE)

LE
CPU Freq: 21333333 21333333 32000000 32000000 128000000 128000000 512000000 512000000 1024000000

RAM size:     192 MB,  # CPU hardware threads:   2
RAM usage:     30 MB,  # Benchmark threads:      1

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

18:        359   100    320    320  |       5898   100    477    477
19:        345   100    311    310  |       5863   100    480    480
20:        335   100    306    306  |       5819   100    482    482
21:        326   100    306    306  |       5967   100    502    502
----------------------------------  | ------------------------------
Avr:             100    311    310  |              100    485    485
Tot:             100    398    398

debian@debian:~$ 7zr b -md=2m       

7-Zip (a) 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,2 CPUs LE)

LE
CPU Freq: 16000000 21333333 21333333 64000000 128000000 85333333 256000000 512000000 2048000000
RAM size:     192 MB,  # CPU hardware threads:   2
RAM usage:     36 MB,  # Benchmark threads:      2

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

18:        500   140    318    446  |      11623   195    481    940
19:        491   144    307    440  |      11535   197    481    945
20:        487   146    305    444  |      11438   196    485    948
21:        460   147    294    432  |      11298   197    483    950
----------------------------------  | ------------------------------
Avr:             144    306    441  |              196    482    946
Tot:             170    394    693
debian@debian:~$
archanox commented 2 years ago

Ah, unfortunately the kernel I'm using for the Kendryte does not support swap files.

ThomasKaiser commented 2 years ago

> unfortunately the kernel I'm using for the Kendryte does not support swap files

That's strange, but anyway: the latest commit should allow the 7-zip benchmark to finish, since it now checks for low RAM and then chooses smaller dictionary sizes. Please give the latest sbc-bench version another try, and please also provide the output from:

7zr b -mmt=1
7zr b

This is just to see where it stops due to being oom-killed and which numbers are generated, since I would assume that choosing smaller dictionary sizes results in slightly better scores.
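The low-RAM check described above could look roughly like the sketch below. It is an assumption-laden illustration, not the code from the commit: the function name `pick_dict_flag` and the 256 MB threshold are invented here, and only `-md=2m` and `7zr b` come from this thread.

```shell
#!/bin/sh
# Hypothetical sketch (not the sbc-bench implementation): choose a 7-zip
# dictionary-size flag from the available memory in kB.
# The 256 MB threshold is an arbitrary assumption for illustration.
pick_dict_flag() {
    local avail_kb="$1"
    if [ "${avail_kb}" -lt 262144 ]; then
        # Low memory: restrict the benchmark to dictionaries up to 2 MB.
        printf '%s\n' "-md=2m"
    else
        # Enough memory: print nothing, so 7-zip keeps its default sizes.
        printf '%s\n' ""
    fi
}

# MemAvailable is reported in kB by /proc/meminfo on Linux 3.14 and later.
# If it cannot be read, fall back to 0 and therefore to the small dictionary.
avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo 2>/dev/null)
flag=$(pick_dict_flag "${avail_kb:-0}")

# Only print the command here; running the benchmark is left to the caller.
printf 'would run: 7zr b %s\n' "${flag}"
```

On the 192 MB board from this issue the sketch would select `-md=2m`, which matches the manual runs above that finished with 30 to 36 MB of RAM usage.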

archanox commented 2 years ago

debian@debian:~$ 7zr b -mmt=1

7-Zip (a) 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,2 CPUs LE)

LE
CPU Freq: 16000000 32000000 64000000 64000000 64000000 85333333 256000000 1024000000 1024000000

RAM size:     192 MB,  # CPU hardware threads:   2
RAM usage:    111 MB,  # Benchmark threads:      1

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:        331   100    322    322  |       5848   100    499    499
23:        305   100    312    312  |       5668   100    491    491
----------------------------------  | ------------------------------
Avr:             100    317    317  |              100    495    495
Tot:             100    406    406
debian@debian:~$ 7zr b       

7-Zip (a) 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,2 CPUs LE)

LE
CPU Freq: 21333333 32000000 32000000 32000000 128000000 128000000 512000000 512000000 2048000000

RAM size:     192 MB,  # CPU hardware threads:   2
RAM usage:    117 MB,  # Benchmark threads:      2

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:        454   148    299    442  |      11294   199    485    964
23:        429   150    291    438  |      10983   199    479    951
----------------------------------  | ------------------------------
Avr:             149    295    440  |              199    482    958
Tot:             174    388    699
debian@debian:~$  
ThomasKaiser commented 2 years ago

OK, taking your last benchmark and comparing the numbers:

Single-threaded:

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

18:        365   100    326    326  |       5871   100    475    475
19:        350   100    314    314  |       5888   100    484    482
20:        344   100    314    314  |       6005   100    498    498
21:        337   100    317    316  |       5843   100    492    491
----------------------------------  | ------------------------------
Avr:             100    318    318  |              100    487    487
Tot:             100    402    402

With larger dictionaries (at least 2^22 and 2^23) we would've been at

22:        331   100    322    322  |       5848   100    499    499
23:        305   100    312    312  |       5668   100    491    491
----------------------------------  | ------------------------------
Avr:             100    317    317  |              100    495    495
Tot:             100    406    406

That's close enough: the difference is only about 1% when using the smaller dictionaries.

Multi-threaded:

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

18:        506   140    322    451  |      11703   197    480    946
19:        494   142    311    443  |      11785   199    485    965
20:        457   141    296    417  |      11387   196    482    944
21:        475   146    305    446  |      11155   198    475    938
----------------------------------  | ------------------------------
Avr:             142    309    439  |              197    480    948
Tot:             170    394    694

With larger dictionaries (at least 2^22 and 2^23) we would've been at

22:        454   148    299    442  |      11294   199    485    964
23:        429   150    291    438  |      10983   199    479    951
----------------------------------  | ------------------------------
Avr:             149    295    440  |              199    482    957
Tot:             174    388    693

Almost identical. Thank you for your help; sbc-bench can now also provide numbers from systems that are low on memory, though I still have to test on x86, arm and arm64 too.
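To make the comparison concrete, the relative difference between the totals quoted above can be checked with a few lines of shell. This is a sketch using a hypothetical helper `pct_diff`; it delegates to awk because shell arithmetic is integer-only.

```shell
#!/bin/sh
# Hypothetical helper: absolute difference between two scores, as a
# percentage of the second (reference) score, with one decimal place.
pct_diff() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (b == 0) { print "n/a"; exit }   # avoid dividing by zero
        d = a - b
        if (d < 0) d = -d
        printf "%.1f\n", d / b * 100
    }'
}

# Single-threaded totals: 402 (small dictionaries) vs 406 (2^22 and 2^23).
pct_diff 402 406    # prints 1.0
# Multi-threaded totals: 694 (small dictionaries) vs 693 (2^22 and 2^23).
pct_diff 694 693    # prints 0.1
```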

archanox commented 2 years ago

debian@debian:~$ sudo /bin/bash ./sbc-bench.sh
[sudo] password for debian: 
WARNING: This tool is meant to run only on Debian Stretch, Buster, Bullseye or Ubuntu Bionic, Focal, Jammy.

When executed on Debian GNU/Linux bookworm/sid results are partially meaningless.
Press [ctrl]-[c] to stop or [enter] to continue.
WARNING: dmesg output does not contain early boot messages which
help in identifying hardware details.

It is recommended to reboot now and then execute the benchmarks.
Press [ctrl]-[c] to stop or [enter] to continue.

sbc-bench v0.9.8

Installing needed tools: Done.
Checking cpufreq OPP. Done (results will be available in 42-61 minutes).
Executing tinymembench. Done.
Executing RAM latency tester. Done.
Executing OpenSSL benchmark. Done.
Executing 7-zip benchmark. Done.
Checking cpufreq OPP. Done (48 minutes elapsed).

ATTENTION: Due to overheating prevention 2 CPU cores have been killed. Check the log for details.

Memory performance (different CPU cores measured individually):
memcpy: 251.2 MB/s 
memset: 439.4 MB/s 
memcpy: 283.5 MB/s 
memset: 439.0 MB/s 

7-zip total scores (3 consecutive runs): 692,698,697

OpenSSL results (different CPU cores measured individually):
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc       6389.00k     8620.03k     9278.21k     9469.95k     9524.57k     9147.73k (rv64i2p0m2p0a2p0f2p0d2p0c2p0xv5-0p0)
aes-128-cbc       6698.89k     8618.73k     9287.25k     9473.02k     9177.77k     9524.57k (rv64i2p0m2p0a2p0f2p0d2p0c2p0xv5-0p0)
aes-192-cbc       5990.51k     7560.17k     8086.53k     8229.89k     7970.82k     8257.54k (rv64i2p0m2p0a2p0f2p0d2p0c2p0xv5-0p0)
aes-192-cbc       5996.09k     7563.43k     8089.86k     7986.86k     8402.26k     8410.45k (rv64i2p0m2p0a2p0f2p0d2p0c2p0xv5-0p0)
aes-256-cbc       5518.11k     6776.36k     6914.30k     7276.89k     7326.38k     7334.57k (rv64i2p0m2p0a2p0f2p0d2p0c2p0xv5-0p0)
aes-256-cbc       5612.18k     6868.59k     7002.28k     7389.53k     7424.68k     7234.00k (rv64i2p0m2p0a2p0f2p0d2p0c2p0xv5-0p0)
Full results uploaded to http://ix.io/41QG. 
debian@debian:~$
ThomasKaiser commented 2 years ago

I tried to fix the cosmetic issues and also the behaviour of treating each CPU core as its own cluster with its own cpufreq policy. If time permits, please test again with the latest version (the latest commit, since a small bug went in with the commit above)...

archanox commented 2 years ago
debian@debian:~$ sudo /bin/bash ./sbc-bench.sh
[sudo] password for debian: 
WARNING: This tool is meant to run only on Debian Stretch, Buster, Bullseye or Ubuntu Bionic, Focal, Jammy.

When executed on Debian GNU/Linux bookworm/sid results are partially meaningless.
Press [ctrl]-[c] to stop or [enter] to continue.
WARNING: dmesg output does not contain early boot messages which
help in identifying hardware details.

It is recommended to reboot now and then execute the benchmarks.
Press [ctrl]-[c] to stop or [enter] to continue.

Average load and/or CPU utilization too high (too much background activity). Waiting...

Too busy for benchmarking: 22:56:00 up 4 min,  3 users,  load average: 0.18, 0.36, 0.18,  cpu: 0.19%
Too busy for benchmarking: 22:56:05 up 4 min,  3 users,  load average: 0.16, 0.35, 0.17,  cpu: 0.18%
Too busy for benchmarking: 22:56:10 up 4 min,  3 users,  load average: 0.15, 0.35, 0.17,  cpu: 0.16%
Too busy for benchmarking: 22:56:15 up 4 min,  3 users,  load average: 0.14, 0.34, 0.17,  cpu: 0.15%
Too busy for benchmarking: 22:56:20 up 4 min,  3 users,  load average: 0.21, 0.35, 0.18,  cpu: 0.14%
Too busy for benchmarking: 22:56:25 up 4 min,  3 users,  load average: 0.19, 0.35, 0.18,  cpu: 0.19%

sbc-bench v0.9.8

Installing needed tools: Done.
Checking cpufreq OPP. Done (results will be available in 46-66 minutes).
Executing tinymembench. Done.
Executing RAM latency tester. Done.
Executing OpenSSL benchmark. Done.
Executing 7-zip benchmark. Done.
Checking cpufreq OPP. Done (48 minutes elapsed).

ATTENTION: Due to overheating prevention 2 CPU cores have been killed. Check the log for details.

Memory performance (different CPU cores measured individually):
memcpy: 262.7 MB/s 
memset: 438.4 MB/s 
memcpy: 283.0 MB/s 
memset: 438.8 MB/s 

7-zip total scores (3 consecutive runs): 693,682,695

OpenSSL results (different CPU cores measured individually):
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc       6359.63k     8728.47k     9436.76k     9624.23k     9680.21k     9317.03k (rv64i2p0m2p0a2p0f2p0d2p0c2p0xv5-0p0)
aes-128-cbc       6697.94k     8622.89k     9288.53k     9466.54k     9177.77k     9513.64k (rv64i2p0m2p0a2p0f2p0d2p0c2p0xv5-0p0)
aes-192-cbc       5984.21k     7558.68k     8073.56k     8228.18k     7968.09k     8273.92k (rv64i2p0m2p0a2p0f2p0d2p0c2p0xv5-0p0)
aes-192-cbc       6019.96k     7562.97k     8089.77k     7974.91k     8368.90k     8383.15k (rv64i2p0m2p0a2p0f2p0d2p0c2p0xv5-0p0)
aes-256-cbc       5521.18k     6775.53k     6913.79k     7289.86k     7318.19k     7323.65k (rv64i2p0m2p0a2p0f2p0d2p0c2p0xv5-0p0)
aes-256-cbc       5526.97k     6778.65k     6914.22k     7294.29k     7329.11k     7329.11k (rv64i2p0m2p0a2p0f2p0d2p0c2p0xv5-0p0)
Full results uploaded to http://ix.io/41XH. Please check the log for anomalies (e.g. swapping or throttling happenend).

debian@debian:~$