solardiz closed this 5 years ago
For example, md5crypt-opencl spends significant time on transferring candidates from host to device.
Device 2: Tahiti [AMD Radeon HD 7900 Series]
Benchmarking: md5crypt-opencl, crypt(3) $1$ [MD5 OpenCL]... DONE
Warning: "Many salts" test limited: 8/256
Many salts: 3919K c/s real, 209715K c/s virtual
Only one salt: 3679K c/s real, 52428K c/s virtual
If I script this, what is a sensible definition of "significant"? One percent? Five?
I think 1% on an otherwise idle system, non-OpenMP (the "noise" with OpenMP is in excess of 1%). For the md5crypt-opencl example above, I saw a 1% difference on the old GTX 570, but as you can see that difference increases to several percent on newer (or not as old and small) GPUs. So 1% on one system may mean a more significant difference on another.
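One way the "significant" criterion could be written down when scripting this, as a minimal sketch: compare the "Many salts" and "Only one salt" real c/s figures against a percentage threshold. The function name, the enum and the parameter layout are hypothetical; only the idea of a threshold around 1% comes from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical verdicts for one salted format's benchmark pair. */
enum salt_verdict {
	KEEP_AS_IS,    /* current setting already matches the measurement */
	SUPPRESS_MANY, /* boost too small: report a single (raw) figure */
	REPORT_MANY    /* boost large enough: report both figures */
};

/*
 * many, one:     "real" c/s of the "Many salts" and "Only one salt" runs
 * threshold_pct: minimum boost in percent, e.g. 1.0 for 1%
 * suppressed:    whether the format currently suppresses "Many salts"
 */
enum salt_verdict classify_boost(double many, double one,
                                 double threshold_pct, bool suppressed)
{
	assert(threshold_pct >= 0.0);
	if (one <= 0.0)
		return KEEP_AS_IS; /* no usable measurement */

	double boost_pct = many * 100.0 / one - 100.0;
	bool significant = boost_pct >= threshold_pct;

	if (significant && suppressed)
		return REPORT_MANY;
	if (!significant && !suppressed)
		return SUPPRESS_MANY;
	return KEEP_AS_IS;
}
```

With the md5crypt-opencl figures quoted above (3919K vs. 3679K), the boost is about 6.5%, so a 1% threshold would call for reporting both figures if they were currently suppressed.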
OK, I'll give it a shot
I'll hack bench.c with format->params.benchmark_length &= ~0x100; and test all formats, then process the output. I'll ignore dynamics. I will test OpenCL but only using a CPU device. I'll run --test=-1 to ensure the "many salts" tests complete, and I'll use a non-OMP build.
diff --git a/src/bench.c b/src/bench.c
index c5535d74f..6edee16ae 100644
--- a/src/bench.c
+++ b/src/bench.c
@@ -719,6 +719,11 @@ AGAIN:
 		if (!format->params.tests && format != fmt_list)
 			continue;
 
+/* Hack for scripting raw/one/many tests, only test salted formats */
+		format->params.benchmark_length &= ~0x100;
+		if (!format->params.salt_size)
+			continue;
+
 /* Just test the encoding-aware formats if --encoding was used explicitly */
 		if (!options.default_enc && options.target_enc != ASCII &&
 		    options.target_enc != ISO_8859_1 &&
#!/usr/bin/perl -w
use warnings;
use strict;

# Reads "john --test" output and, for each salted format, compares the
# "Many salts" and "Only one salt" figures against the format's current
# benchmark length flags (bit 0x100 set means the format is treated as raw).

my $format = "";
my $many = 0;
my $one = 0;
my $perc = 0;

while (<>) {
    if (m/^Benchmarking:\s*([^, ]+)/) {
        $format = $1;
    }
    if (m/^Many salts:\s*([0-9\.]+)([KMG])?/) {
        $many = $1;
        if (defined $2) {
            $many *= 1000 if $2 eq "K";
            $many *= 1000000 if $2 eq "M";
            $many *= 1000000000 if $2 eq "G";
        }
    } elsif (m/^Only one salt:\s*([0-9\.]+)([KMG])?/) {
        $one = $1;
        if (defined $2) {
            $one *= 1000 if $2 eq "K";
            $one *= 1000000 if $2 eq "M";
            $one *= 1000000000 if $2 eq "G";
        }
        # Field 10 of the format details is used as the current flags (hex).
        my $cur = `../run/john -form=$format -list=format-details 2>/dev/null | cut -f 10`;
        next if !$cur; # dynamic
        chomp $cur;
        $cur = hex($cur);
        next if $one == 0; # failed benchmark, avoid division by zero
        $perc = $many * 100 / $one;
        if ($many < $one && !($cur & 0x100)) {
            $cur |= 0x100;
            printf("%s has slower many-salts! %s vs. %s, set to 0x%x\n", $format, $many, $one, $cur);
        } elsif ($many == $one && !($cur & 0x100)) {
            $cur |= 0x100;
            printf("%s has same one/many figure: %s c/s, set to 0x%x\n", $format, $many, $cur);
        } elsif (($perc >= 101) && ($cur & 0x100)) {
            # At least 1% faster with many salts: stop treating it as raw
            $cur &= ~0x100;
            printf("%s has sufficient boost: %.1f%%, set to 0x%x\n", $format, $perc, $cur);
        } elsif (($perc < 101) && !($cur & 0x100)) {
            # Less than 1% faster: treat as raw
            $cur |= 0x100;
            printf("%s has insufficient boost: %.1f%%, set to 0x%x\n", $format, $perc, $cur);
        }
        $format = "";
        $many = $one = $perc = 0;
    }
}
Here's the output. I can't test OpenCL on my laptop, it takes too long (using --test=-1).
All these "should" be changed. Some core formats included. Perhaps go for 2%, or 1.5?
Currently not set as raw:
bitshares has insufficient boost: 100.5%, set to 0x107
dominosec8 has insufficient boost: 100.4%, set to 0x107
Currently set as raw:
md5crypt has sufficient boost: 102.3%, set to 0x7
bcrypt has sufficient boost: 101.1%, set to 0x7
BestCrypt has sufficient boost: 106.1%, set to 0x7
BKS has sufficient boost: 108.7%, set to 0x7
Blockchain has sufficient boost: 101.5%, set to 0x7
chap has sufficient boost: 104.6%, set to 0x7
Clipperz has sufficient boost: 101.4%, set to 0x7
sha1crypt has sufficient boost: 123.3%, set to 0x7
Django has sufficient boost: 101.3%, set to 0x7
dmd5 has sufficient boost: 101.1%, set to 0x7
dmg has sufficient boost: 141.1%, set to 0x7
gost has sufficient boost: 101.9%, set to 0x7
IKE has sufficient boost: 102.2%, set to 0x7
KeePass has sufficient boost: 176.6%, set to 0x7
keychain has sufficient boost: 101.3%, set to 0x7
keyring has sufficient boost: 110.4%, set to 0x7
krb4 has sufficient boost: 101.1%, set to 0x7
krb5-18 has sufficient boost: 101.4%, set to 0x7
krb5-3 has sufficient boost: 110.8%, set to 0x7
leet has sufficient boost: 102.5%, set to 0x7
money has sufficient boost: 102.3%, set to 0x7
mysqlna has sufficient boost: 103.7%, set to 0x7
nk has sufficient boost: 102.0%, set to 0x7
notes has sufficient boost: 101.8%, set to 0x7
o5logon has sufficient boost: 103.2%, set to 0x7
openssl-enc has sufficient boost: 101.0%, set to 0x7
oracle has sufficient boost: 119.0%, set to 0x7
PBKDF2-HMAC-MD5 has sufficient boost: 101.3%, set to 0x7
PBKDF2-HMAC-SHA1 has sufficient boost: 101.7%, set to 0x7
pfx has sufficient boost: 129.8%, set to 0x7
phpass has sufficient boost: 102.9%, set to 0x7
pomelo has sufficient boost: 105.5%, set to 0x7
SSHA512 has sufficient boost: 104.1%, set to 0x7
securezip has sufficient boost: 106.0%, set to 0x7
aix-smd5 has sufficient boost: 101.7%, set to 0x7
vdi has sufficient boost: 101.9%, set to 0x7
OpenVMS has sufficient boost: 142.6%, set to 0x7
ZIP has sufficient boost: 101.5%, set to 0x7
Slower many-salts (usually non-hashes that should be raw)
bsdicrypt has slower many-salts! 399714 vs. 404480, set to 0x107
AndroidBackup has slower many-salts! 1317 vs. 1349, set to 0x107
andOTP has slower many-salts! 116740 vs. 292407, set to 0x107
ansible has slower many-salts! 1137 vs. 1172, set to 0x107
argon2 has slower many-salts! 175 vs. 180, set to 0x107
as400-des has slower many-salts! 4148000 vs. 4270000, set to 0x107
AzureAD has slower many-salts! 91232 vs. 92064, set to 0x107
Bitcoin has slower many-salts! 42.6 vs. 50.4, set to 0x107
Bitwarden has slower many-salts! 2270 vs. 2327, set to 0x107
diskcryptor has slower many-salts! 4435 vs. 4464, set to 0x107
EncFS has slower many-salts! 75.7 vs. 76.9, set to 0x107
gpg has slower many-salts! 12497 vs. 12766, set to 0x107
keystore has slower many-salts! 1586000 vs. 1627000, set to 0x107
known_hosts has slower many-salts! 4429000 vs. 4510000, set to 0x107
monero has slower many-salts! 4.5 vs. 4.7, set to 0x107
Mozilla has slower many-salts! 515168 vs. 526336, set to 0x107
mscash2 has slower many-salts! 2712 vs. 2736, set to 0x107
nsec3 has slower many-salts! 88600 vs. 89236, set to 0x107
radius has slower many-salts! 15947000 vs. 52426000, set to 0x107
SNMP has slower many-salts! 188 vs. 281, set to 0x107
telegram has slower many-salts! 959 vs. 963, set to 0x107
tezos has slower many-salts! 2167 vs. 2192, set to 0x107
wpapsk has slower many-salts! 3330 vs. 3392, set to 0x108
wpapsk-pmk has slower many-salts! 1030000 vs. 1045000, set to 0x140
I think bcrypt and bsdicrypt should be left as-is. For md5crypt, I'm not so sure - maybe jumbo's is different, but core's has no reason to be of significantly different speed by salt count.
We need to investigate radius - there's probably some trivial optimization to make to it.
Suggested fix for many of these (esp. non-hashes): check the costs of the two benchmark salts and, if they are not the same, revert to raw.
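A rough sketch of what such a check could look like, assuming the tunable costs of the two benchmark salts are available as two small arrays. The function names and the way the result is folded back into the benchmark length are assumptions for illustration, not existing bench.c code; 0x100 is the raw flag discussed in this thread, and the example costs are the two Bitcoin iteration counts mentioned in it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if any of the n_costs tunable costs differs between the two
 * salts (test vectors) used for benchmarking. */
bool bench_costs_differ(const unsigned int *first,
                        const unsigned int *second, size_t n_costs)
{
	assert(n_costs == 0 || (first != NULL && second != NULL));
	for (size_t i = 0; i < n_costs; i++)
		if (first[i] != second[i])
			return true;
	return false;
}

/* Revert to raw (0x100) when the two salts are not comparable. */
int adjust_benchmark_length(int benchmark_length,
                            const unsigned int *first,
                            const unsigned int *second, size_t n_costs)
{
	if (bench_costs_differ(first, second, n_costs))
		return benchmark_length | 0x100;
	return benchmark_length;
}
```

For example, costs of 177864 and 258507 would turn a benchmark length of 7 into 0x107, while two equal costs would leave it untouched.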
Out of the "Slower many-salts", we already took care of radius. I've confirmed that these also have the problem, and need work potentially beyond cosmetic/benchmark changes: andOTP, Bitcoin, SNMP. The rest on that list have small performance differences; I tested a few and wasn't able to reliably reproduce the "Slower many-salts" behavior.
andOTP's first two test vectors differ in speed a lot, presumably due to their different ciphertext lengths. We don't treat this as a tunable cost, so simply switching to Raw (0x107) doesn't result in reporting of two different tunable costs. Switching to 0x507 locks us to the first test vector only, and I think makes more sense in this case - we'll just have speed reporting for one particular ciphertext length.
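To make the hex values in this thread easier to read, here is a sketch of the bit manipulation involved. It is based only on what is said here: 0x100 selects raw reporting, 0x507 additionally locks the benchmark to the first test vector, and the remaining low bits carry the benchmark length (7, 8 and so on). The macro and function names are made up for illustration and are not JtR identifiers, and 0x500 is treated as one opaque mask because the thread does not spell out its individual bits.

```c
#include <assert.h>

/* Hypothetical names; the values are the ones quoted in this thread. */
#define SKETCH_RAW         0x100 /* suppress "Many salts", report "Raw" */
#define SKETCH_RAW_ONE_VEC 0x500 /* raw, and use only the first test vector */
#define SKETCH_LENGTH_MASK 0x0ff /* low byte holds the length (assumed) */

/* Clear the raw bit, as the bench.c hack does for every format. */
int force_many_salts(int benchmark_length)
{
	return benchmark_length & ~SKETCH_RAW;
}

int set_raw(int benchmark_length)
{
	return benchmark_length | SKETCH_RAW;
}

int set_raw_first_vector(int benchmark_length)
{
	return benchmark_length | SKETCH_RAW_ONE_VEC;
}

int is_raw(int benchmark_length)
{
	return (benchmark_length & SKETCH_RAW) != 0;
}

/* Sanity checks against values that appear in this thread. */
void check_thread_values(void)
{
	assert(set_raw(7) == 0x107);              /* "set to 0x107" */
	assert(force_many_salts(0x107) == 7);     /* "set to 0x7" */
	assert(set_raw_first_vector(7) == 0x507); /* first test vector only */
	assert((set_raw(8) & SKETCH_LENGTH_MASK) == 8);
}
```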
For Bitcoin, we could do:
Speed for cost 1 (iteration count) of 177864 and 258507
But there's little point in that. It's cleaner to report a speed for one of these costs. So:
+++ b/src/bench.c
@@ -473,7 +473,7 @@ char *benchmark_format(struct fmt_main *format, int salts,
 		     format->methods.tunable_cost_value[i] != NULL; i++) {
 			char msg[MAX_COST_MSG_LEN];
 
-			if (t_cost[0][i] == t_cost[1][i])
+			if (t_cost[0][i] == t_cost[1][i] || salts <= 1)
 				snprintf(msg, sizeof(msg), "cost %d (%s) of %u", i + 1,
 				         format->params.tunable_cost_name[i],
 				         t_cost[0][i]);
+++ b/src/bitcoin_fmt_plug.c
@@ -63,7 +63,7 @@ john_register_one(&fmt_bitcoin);
 #endif
 
 #define BENCHMARK_COMMENT ""
-#define BENCHMARK_LENGTH 7
+#define BENCHMARK_LENGTH 0x507
 #define PLAINTEXT_LENGTH 125
 #define BINARY_SIZE 0
 #define BINARY_ALIGN 1
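The effect of the bench.c change on the reported cost line can be illustrated with a stand-alone helper. This is a simplified sketch, not the actual benchmark_format() code: the helper name and its parameters are hypothetical, and the " and %u" branch is inferred from the "177864 and 258507" output quoted above.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format the description of one tunable cost. When only one salt is
 * benchmarked (salts <= 1), only the first test vector's cost applies,
 * so the second cost is not mentioned even if it differs.
 * Returns what snprintf() returns.
 */
int format_cost_msg(char *msg, size_t size, int index, const char *name,
                    unsigned int cost0, unsigned int cost1, int salts)
{
	assert(msg != NULL && size > 0 && name != NULL);
	if (cost0 == cost1 || salts <= 1)
		return snprintf(msg, size, "cost %d (%s) of %u",
		                index + 1, name, cost0);
	return snprintf(msg, size, "cost %d (%s) of %u and %u",
	                index + 1, name, cost0, cost1);
}
```

Calling it with index 0, the name "iteration count", costs 177864 and 258507 and salts set to 1 yields "cost 1 (iteration count) of 177864"; with more than one salt the same call yields the longer "of 177864 and 258507" form.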
"SNMP" supports 3 different algorithms, but doesn't report tunable costs. Also, I seem to have just doubled its performance by avoiding modulo division.
Also, "SNMP" allows for an actual, very good speedup for "Many salts", which the current code doesn't exploit. It can cache authKey for the MD5 and SHA-1 variations of the algorithm if/as they're first computed and then reuse them for further salts.
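The caching idea can be sketched as follows. This is not the SNMP format's code: the structure, the names, the sizes and especially slow_derive() (a placeholder that does no real cryptography) are assumptions made only to show the "compute on first use, reuse for further salts" pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define KEY_LEN  20 /* room for a SHA-1 sized key (assumed) */
#define MAX_KEYS 4  /* hypothetical number of candidates per batch */

enum { ALG_MD5, ALG_SHA1, ALG_COUNT };

struct key_cache {
	bool valid[ALG_COUNT][MAX_KEYS];
	unsigned char key[ALG_COUNT][MAX_KEYS][KEY_LEN];
	unsigned long derivations; /* how often the slow path ran */
};

/* Placeholder for the expensive, salt-independent step (NOT real crypto). */
static void slow_derive(int alg, const char *password, unsigned char *out)
{
	memset(out, alg + 1, KEY_LEN);
	for (size_t i = 0; password[i]; i++)
		out[i % KEY_LEN] ^= (unsigned char)password[i];
}

/* Call whenever the set of candidate passwords changes. */
void cache_reset(struct key_cache *c)
{
	memset(c->valid, 0, sizeof(c->valid));
}

/* Return the cached key, deriving it only the first time it is needed. */
const unsigned char *get_auth_key(struct key_cache *c, int alg, int index,
                                  const char *password)
{
	assert(alg >= 0 && alg < ALG_COUNT);
	assert(index >= 0 && index < MAX_KEYS);
	if (!c->valid[alg][index]) {
		slow_derive(alg, password, c->key[alg][index]);
		c->valid[alg][index] = true;
		c->derivations++;
	}
	return c->key[alg][index];
}
```

With a layout like this, looping over many salts for the same batch of candidates would hit the slow path once per candidate and algorithm instead of once per salt.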
I took care of the issues identified by magnum above, and some more, in #3873. Now we also need to take care of the rest of OpenCL formats.
I can take that (using super or well) - or would you like me to concentrate on something else? I should be able to work with Jumbo 10 hours a day or so from now on.
@magnumripper It'd be great if you complete the OpenCL part of this issue tomorrow, if you're available.
It could also be a good idea to rerun your test on "super". I was using 31 threads on "super" to confirm/disprove these issues (and when in doubt, I also ran some tests using 256 threads on a Knights Landing), but perhaps running your original test on "super" would identify more potential issues due to the higher thread count than your laptop's. I also used the Titan X for testing of the counterpart OpenCL formats. ("super" currently has one CPU core and GTX 1080 busy, running a lengthy job.)
Thank you!
Started overnight tests using Titan X and 31 threads (affinity 0-30 - and I saw the other process moved to 31). I merged the fix-3795 branch first.
If the now re-opened #3091 results in any changes, I'll try to remember to re-test the affected formats for this issue again afterwards.
Changed threshold to 2% here.
Currently not set as raw (see https://github.com/magnumripper/JohnTheRipper/issues/3795#issuecomment-482585867):
md5crypt has insufficient boost: 101.0%, change to 0x107
Clipperz has insufficient boost: 100.7%, change to 0x107
Currently set as raw:
aix-ssha1 has sufficient boost: 108.0%, change to 7
aix-ssha256 has sufficient boost: 104.2%, change to 7
aix-ssha512 has sufficient boost: 102.5%, change to 7
ansible has sufficient boost: 102.8%, change to 7
BestCrypt has sufficient boost: 124.1%, change to 7
BKS has sufficient boost: 108.1%, change to 7
Blackberry-ES10 has sufficient boost: 102.7%, change to 7
WoWSRP has sufficient boost: 104.5%, change to 8
sha1crypt has sufficient boost: 124.3%, change to 7
dmg has sufficient boost: 127.1%, change to 7
DPAPImk has sufficient boost: 102.8%, change to 7
eCryptfs has sufficient boost: 120.3%, change to 7
gost has sufficient boost: 167.2%, change to 7
KeePass has sufficient boost: 180.1%, change to 7
keyring has sufficient boost: 108.1%, change to 7
krb5-17 has sufficient boost: 103.4%, change to 7
lpcli has sufficient boost: 105.5%, change to 7
lotus85 has sufficient boost: 111.3%, change to 7
MongoDB has sufficient boost: 111.1%, change to 7
mscash2 has sufficient boost: 104.9%, change to 7
o10glogon has sufficient boost: 126.5%, change to 7
o3logon has sufficient boost: 114.8%, change to 7
PBKDF2-HMAC-SHA256 has sufficient boost: 102.4%, change to 7
pfx has sufficient boost: 131.6%, change to 7
pgpwde has sufficient boost: 102.5%, change to 7
postgres has sufficient boost: 125.5%, change to 7
solarwinds has sufficient boost: 110.4%, change to 7
telegram has sufficient boost: 102.1%, change to 7
vmx has sufficient boost: 103.6%, change to 7
Slower many-salts
saph has slower many-salts! 285339 vs. 287680, change to 0x107
LastPass has slower many-salts! 128909 vs. 130944, change to 0x107
sspr has slower many-salts! 862 vs. 972, change to 0x107
What did you run these tests on and with what settings?
IIRC, I tested md5crypt and Clipperz on KNL to confirm there's significant difference in Many vs. Only one salt with high thread count. Of course, there's almost no difference with few threads.
These tests are on super, 31 threads. OpenCL still testing due to descrypt acting up.
Benchmarking: md5crypt, crypt(3) $1$ [MD5 128/128 AVX 4x3]... (31xOMP) DONE
Many salts: 592224 c/s real, 19398 c/s virtual
Only one salt: 586272 c/s real, 19165 c/s virtual
Benchmarking: Clipperz, SRP [SHA256 32/64 GMP-exp]... (31xOMP) DONE
Many salts: 569878 c/s real, 18527 c/s virtual
Only one salt: 565734 c/s real, 18509 c/s virtual
I've just re-tested on KNL. For md5crypt there's consistent significant speedup for Many salts. For clipperz the results are inconsistent, so we might want to revert that one to 0x107.
[solar@localhost run]$ GOMP_CPU_AFFINITY=0-255 GOMP_SPINCOUNT=10000 ../run/john -test -form=md5crypt
Will run 256 OpenMP threads
Benchmarking: md5crypt, crypt(3) $1$ [MD5 512/512 AVX512F 16x3]... (256xOMP) DONE
Warning: "Many salts" test limited: 43/256
Many salts: 2072K c/s real, 9214 c/s virtual
Only one salt: 1879K c/s real, 9237 c/s virtual
[solar@localhost run]$ OMP_NUM_THREADS=128 GOMP_CPU_AFFINITY=0-255 GOMP_SPINCOUNT=10000 ../run/john -test -form=md5crypt
Will run 128 OpenMP threads
Benchmarking: md5crypt, crypt(3) $1$ [MD5 512/512 AVX512F 16x3]... (128xOMP) DONE
Warning: "Many salts" test limited: 84/256
Many salts: 2064K c/s real, 17600 c/s virtual
Only one salt: 1776K c/s real, 17514 c/s virtual
[solar@localhost run]$ OMP_NUM_THREADS=192 GOMP_CPU_AFFINITY=0-255 GOMP_SPINCOUNT=10000 ../run/john -test -form=md5crypt
Will run 192 OpenMP threads
Benchmarking: md5crypt, crypt(3) $1$ [MD5 512/512 AVX512F 16x3]... (192xOMP) DONE
Warning: "Many salts" test limited: 57/256
Many salts: 2080K c/s real, 11954 c/s virtual
Only one salt: 1843K c/s real, 11902 c/s virtual
[solar@localhost run]$ OMP_NUM_THREADS=192 GOMP_CPU_AFFINITY=0-255 GOMP_SPINCOUNT=10000 ../run/john -test -form=md5crypt
Will run 192 OpenMP threads
Benchmarking: md5crypt, crypt(3) $1$ [MD5 512/512 AVX512F 16x3]... (192xOMP) DONE
Warning: "Many salts" test limited: 57/256
Many salts: 2060K c/s real, 11967 c/s virtual
Only one salt: 1843K c/s real, 11887 c/s virtual
[solar@localhost run]$ GOMP_CPU_AFFINITY=0-255 GOMP_SPINCOUNT=10000 ../run/john -test -form=md5crypt
Will run 256 OpenMP threads
Benchmarking: md5crypt, crypt(3) $1$ [MD5 512/512 AVX512F 16x3]... (256xOMP) DONE
Warning: "Many salts" test limited: 42/256
Many salts: 2043K c/s real, 9192 c/s virtual
Only one salt: 1879K c/s real, 9228 c/s virtual
[solar@localhost run]$ GOMP_CPU_AFFINITY=0-255 GOMP_SPINCOUNT=10000 ../run/john -test -form=md5crypt
Will run 256 OpenMP threads
Benchmarking: md5crypt, crypt(3) $1$ [MD5 512/512 AVX512F 16x3]... (256xOMP) DONE
Warning: "Many salts" test limited: 43/256
Many salts: 2072K c/s real, 9204 c/s virtual
Only one salt: 1879K c/s real, 9194 c/s virtual
[solar@localhost run]$ GOMP_CPU_AFFINITY=0-255 GOMP_SPINCOUNT=10000 ../run/john -test -form=clipperz
Will run 256 OpenMP threads
Benchmarking: Clipperz, SRP [SHA256 32/64 GMP-exp]... (256xOMP) DONE
Warning: "Many salts" test limited: 5/256
Many salts: 541619 c/s real, 2279 c/s virtual
Only one salt: 528516 c/s real, 2275 c/s virtual
[solar@localhost run]$ GOMP_CPU_AFFINITY=0-255 GOMP_SPINCOUNT=10000 ../run/john -test -form=clipperz
Will run 256 OpenMP threads
Benchmarking: Clipperz, SRP [SHA256 32/64 GMP-exp]... (256xOMP) DONE
Warning: "Many salts" test limited: 5/256
Many salts: 537180 c/s real, 2281 c/s virtual
Only one salt: 541619 c/s real, 2276 c/s virtual
[solar@localhost run]$ GOMP_CPU_AFFINITY=0-255 GOMP_SPINCOUNT=10000 ../run/john -test -form=clipperz
Will run 256 OpenMP threads
Benchmarking: Clipperz, SRP [SHA256 32/64 GMP-exp]... (256xOMP) DONE
Warning: "Many salts" test limited: 5/256
Many salts: 537180 c/s real, 2280 c/s virtual
Only one salt: 537180 c/s real, 2276 c/s virtual
[solar@localhost run]$ GOMP_CPU_AFFINITY=0-255 GOMP_SPINCOUNT=10000 ../run/john -test -form=clipperz
Will run 256 OpenMP threads
Benchmarking: Clipperz, SRP [SHA256 32/64 GMP-exp]... (256xOMP) DONE
Warning: "Many salts" test limited: 5/256
Many salts: 537180 c/s real, 2280 c/s virtual
Only one salt: 546133 c/s real, 2274 c/s virtual
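To put numbers on "consistent" versus "inconsistent", the many/one ratio of each repeated run can be compared. The sketch below does that for the 256-thread runs listed above; the struct and function names are hypothetical, and the figures are copied from the logs (K expanded to thousands).

```c
#include <assert.h>
#include <stddef.h>

struct run { double many, one; };           /* "real" c/s of one run */
struct spread { double min_pct, max_pct; }; /* range of many/one in % */

/* Smallest and largest many/one ratio (in percent) over n runs. */
struct spread boost_spread(const struct run *runs, size_t n)
{
	assert(runs != NULL && n > 0);
	struct spread s = { 1e300, 0.0 };
	for (size_t i = 0; i < n; i++) {
		assert(runs[i].one > 0.0);
		double pct = runs[i].many * 100.0 / runs[i].one;
		if (pct < s.min_pct)
			s.min_pct = pct;
		if (pct > s.max_pct)
			s.max_pct = pct;
	}
	return s;
}

/* 256-thread KNL runs from the logs above. */
const struct run md5crypt_runs[] = {
	{ 2072e3, 1879e3 }, { 2043e3, 1879e3 }, { 2072e3, 1879e3 },
};
const struct run clipperz_runs[] = {
	{ 541619, 528516 }, { 537180, 541619 },
	{ 537180, 537180 }, { 537180, 546133 },
};
```

For these inputs the md5crypt ratio stays between roughly 108.7% and 110.3%, while the Clipperz ratio ranges from about 98.4% to 102.5%, so it falls on both sides of 100%.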
Added a different-cost warning to the script. New output, including OpenCL.
31 threads and/or Titan X, ran with GOMP_CPU_AFFINITY but without GOMP_SPINCOUNT.
AndroidBackup has insufficient boost (different costs): 101.5%, change to 0x107
as400-des has insufficient boost: 101.2%, change to 0x107
Bitwarden has insufficient boost (different costs): 101.0%, change to 0x107
diskcryptor has insufficient boost (different costs): 100.5%, change to 0x107
dominosec8 has insufficient boost: 100.7%, change to 0x107
EncFS has insufficient boost (different costs): 100.9%, change to 0x107
nsec3 has insufficient boost: 100.0%, change to 0x107
tezos has insufficient boost (different costs): 100.5%, change to 0x107
wpapsk has insufficient boost (different costs): 100.6%, change to 0x108
ansible-opencl has insufficient boost (different costs): 101.2%, change to 0x107
bitwarden-opencl has insufficient boost (different costs): 100.4%, change to 0x107
mscash2-opencl has insufficient boost: 100.7%, change to 0x107
SL3-opencl has insufficient boost: 100.2%, change to 0x10f
wpapsk-opencl has insufficient boost (different costs): 101.1%, change to 0x108
aix-ssha1 has sufficient boost (different costs): 108.0%, change to 7
aix-ssha256 has sufficient boost (different costs): 104.2%, change to 7
aix-ssha512 has sufficient boost (different costs): 102.5%, change to 7
BestCrypt has sufficient boost (different costs): 124.1%, change to 7
BKS has sufficient boost: 108.1%, change to 7
Blackberry-ES10 has sufficient boost: 102.7%, change to 7
WoWSRP has sufficient boost: 104.5%, change to 8
chap has sufficient boost: 185.7%, change to 7
sha1crypt has sufficient boost (different costs): 124.3%, change to 7
dmd5 has sufficient boost: 132.0%, change to 7
dmg has sufficient boost (different costs): 127.1%, change to 7
DPAPImk has sufficient boost (different costs): 102.8%, change to 7
eCryptfs has sufficient boost: 120.3%, change to 7
gost has sufficient boost: 167.2%, change to 7
KeePass has sufficient boost (different costs): 180.1%, change to 7
keyring has sufficient boost (different costs): 108.1%, change to 7
krb5-17 has sufficient boost: 103.4%, change to 7
krb5-3 has sufficient boost: 132.0%, change to 7
lpcli has sufficient boost (different costs): 105.5%, change to 7
leet has sufficient boost: 122.6%, change to 7
lotus85 has sufficient boost: 111.3%, change to 7
money has sufficient boost: 189.3%, change to 7
MongoDB has sufficient boost (different costs): 111.1%, change to 7
mysqlna has sufficient boost: 115.5%, change to 7
nk has sufficient boost: 132.3%, change to 7
o10glogon has sufficient boost: 126.5%, change to 7
o3logon has sufficient boost: 114.8%, change to 7
o5logon has sufficient boost: 152.1%, change to 7
openssl-enc has sufficient boost: 166.4%, change to 7
oracle has sufficient boost: 199.2%, change to 7
PBKDF2-HMAC-SHA256 has sufficient boost (different costs): 102.4%, change to 7
pfx has sufficient boost (different costs): 131.6%, change to 7
pgpwde has sufficient boost (different costs): 102.5%, change to 7
phpass has sufficient boost (different costs): 104.9%, change to 7
postgres has sufficient boost: 125.5%, change to 7
SSHA512 has sufficient boost: 186.5%, change to 7
securezip has sufficient boost: 117.0%, change to 7
solarwinds has sufficient boost: 110.4%, change to 7
OpenVMS has sufficient boost: 117.6%, change to 7
vmx has sufficient boost (different costs): 103.6%, change to 7
sha1crypt-opencl has sufficient boost (different costs): 130.3%, change to 7
KeePass-opencl has sufficient boost (different costs): 178.5%, change to 7
oldoffice-opencl has sufficient boost (different costs): 102.6%, change to 7
PBKDF2-HMAC-MD4-opencl has sufficient boost (different costs): 147.5%, change to 7
PBKDF2-HMAC-MD5-opencl has sufficient boost (different costs): 133.9%, change to 7
PBKDF2-HMAC-SHA1-opencl has sufficient boost (different costs): 115.6%, change to 7
rar-opencl has sufficient boost: 103.2%, change to 5
agilekeychain-opencl has sufficient boost (different costs): 102.3%, change to 7
axcrypt2-opencl has sufficient boost (different costs): 108.4%, change to 7
md5crypt-opencl has sufficient boost: 111.3%, change to 7
sha256crypt-opencl has sufficient boost (different costs): 102.1%, change to 7
dmg-opencl has sufficient boost (different costs): 145.6%, change to 7
ethereum-opencl has sufficient boost (different costs): 199.8%, change to 7
keyring-opencl has sufficient boost (different costs): 114.5%, change to 7
krb5pa-md5-opencl has sufficient boost: 104.5%, change to 7
lp-opencl has sufficient boost (different costs): 105.1%, change to 7
lpcli-opencl has sufficient boost (different costs): 102.6%, change to 7
ntlmv2-opencl has sufficient boost: 103.2%, change to 7
o5logon-opencl has sufficient boost: 299.8%, change to 7
ODF-opencl has sufficient boost (different costs): 133.2%, change to 7
office-opencl has sufficient boost (different costs): 103.9%, change to 7
PBKDF2-HMAC-SHA256-opencl has sufficient boost (different costs): 102.7%, change to 7
pfx-opencl has sufficient boost (different costs): 127.4%, change to 7
PHPass-opencl has sufficient boost (different costs): 107.7%, change to 7
vmx-opencl has sufficient boost (different costs): 106.6%, change to 7
andOTP has slower many-salts! 1078000 vs. 2856000, change to 0x107
argon2 has slower many-salts (different costs)! 1982 vs. 2000, change to 0x107
Bitcoin has slower many-salts (different costs)! 240 vs. 295, change to 0x107
gpg has slower many-salts (different costs)! 73408 vs. 74431, change to 0x107
monero has slower many-salts! 32.9 vs. 34.0, change to 0x107
saph has slower many-salts (different costs)! 285339 vs. 287680, change to 0x107
LastPass has slower many-salts (different costs)! 128909 vs. 130944, change to 0x107
SNMP has slower many-salts! 3257 vs. 6019, change to 0x107
sspr has slower many-salts (different costs)! 862 vs. 972, change to 0x107
OpenBSD-SoftRAID-opencl has slower many-salts! 132815 vs. 134663, change to 0x107
For Bitcoin, we could do:
Speed for cost 1 (iteration count) of 177864 and 258507
But there's little point in that. It's cleaner to report a speed for one of these costs. (...)
How about doing the |= 0x500 within bench.c instead of in the format, for any format that has different costs? That way we'd fix quite a good number of formats.
I assume you mean "different costs" between the first two test vectors. I don't mind giving this change a try as long as we fully review its effect - should be easy to spot which formats stopped reporting "and" in their benchmark tunable costs list.
Yes that's what I meant. I had a look at doing so but it's not that trivial - I don't dare doing it until post release. I'll just change a bunch of formats instead.
Reopening for magnum not to forget to address the comments on d99065fd9a2db4d082bda9b423eef10fd40954d5.
Thanks. I've been stalled (yet again) by other stuff and am completely exhausted right now but my plan is to get a lot done tomorrow (really) after hibernation and recharging.
It's likely that in some formats we suppress the "Many salts" benchmarks inappropriately - formats where significant performance difference by salt count would often be seen.
It's also likely that in some formats we should suppress the "Many salts" benchmarks, but currently don't - formats where no significant performance difference by salt count is ever seen.
We should revisit this, and make sure we use appropriate settings.
Edit: note that for unsalted formats the suppression is automatic, and no action is needed for those. The review requested here is for salted formats only.