Open forthrin opened 4 months ago
There are several (non-)issues here.
`-v=5`. You can also force usage of previously tuned LWS/GWS to skip the auto-tuning, e.g. by `LWS=64 GWS=524288 john ...` (setting those as environment variables via the Unix shell; note that it's easy copy-paste of what had been output before) or by using the `--lws` and `--gws` command-line options. You can also skip the self-test with `--skip-self-test`, but we generally recommend against skipping it.
Thanks for reaching out!
Consecutive plain `john --format=dmg-opencl --test` runs are consistently slow (eg. nothing seems automatically cached/reused between runs, if that was supposed to happen). Explicitly setting LWS and GWS finished the test in 12s.
Build was the plain recommended `./configure && make -s clean && make -sj4`. Do tell how to provide the necessary information to pinpoint why OpenMP is not enabled by default as expected.
checking for gcc option to support OpenMP... unsupported
AES-NI support ..................................... no
Cross compiling .................................... no
OpenMP support ..................................... no
librexgen (regex mode, see doc/README.librexgen) ... no
ZTEX USB-FPGA module 1.15y support ................. no
$ which gcc
/usr/bin/gcc
$ gcc --version
Apple clang version 15.0.0 (clang-1500.3.9.4)
Target: arm64-apple-darwin23.5.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
PS! Adding a stopwatch timestamp at the beginning of each debug line would make it easier to see/share performance!
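Until something like that exists in john itself, a small shell filter can add elapsed-time prefixes from the outside. This is a minimal sketch, assuming a `date` that supports `+%s` (both the macOS and GNU versions do); the `stamp` function name is made up for illustration, and resolution is whole seconds only.

```shell
# stamp: prefix every line read from stdin with the seconds elapsed
# since the filter started.
stamp() {
    stamp_start=$(date +%s)
    while IFS= read -r line; do
        stamp_now=$(date +%s)
        printf '[%4ds] %s\n' "$((stamp_now - stamp_start))" "$line"
    done
}

# Harmless demo. With john it would look like:
#   john --format=dmg-opencl --test -v=5 2>&1 | stamp
printf 'first line\nsecond line\n' | stamp
```

The timestamps show when a line reaches the pipe, so they can lag behind the real event if the program buffers its output.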
Consecutive plain `john --format=dmg-opencl --test` runs are consistently slow (eg. nothing seems automatically cached/reused between runs, if that was supposed to happen).
That's the wrong conclusion - like I wrote, only 1 out of 3 things is supposed to be cached, so the runs would remain slow overall, yet some caching would occur. That's by design.
Since the auto-tuning is usually the slowest part, no wonder setting explicit LWS/GWS removes most of the delay for you.
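The two ways of pinning the work sizes mentioned in this thread can be put side by side. This is only a sketch: the values 64 and 524288 are the ones quoted above and should be replaced with whatever your own auto-tune run printed, and the `cmd_env`/`cmd_opt` variable names are made up. The commands are printed rather than executed, so the snippet is harmless without john installed.

```shell
# Values copied from a previous auto-tune run (the ones quoted in this thread).
LWS=64
GWS=524288

# Form 1: environment variables, set only for this one command.
cmd_env="LWS=$LWS GWS=$GWS john --format=dmg-opencl --test"

# Form 2: the equivalent command-line options.
cmd_opt="john --format=dmg-opencl --test --lws=$LWS --gws=$GWS"

echo "$cmd_env"
echo "$cmd_opt"
```

Either form skips the auto-tuning step; the self-test still runs unless it is skipped separately.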
Do tell how to provide the necessary information to pinpoint why OpenMP is not enabled by default as expected.
OK, apparently Apple still provides only crippled clang in Xcode. This has some answers: https://stackoverflow.com/questions/71061894/how-to-install-openmp-on-mac-m1
We also have binary builds in https://github.com/openwall/john-packages/releases - I think these are with OpenMP.
I don't use macOS myself.
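To see whether a given compiler accepts OpenMP flags before rerunning `./configure`, a quick probe along these lines may help. It is a sketch, not what configure itself does internally; `cc_try_openmp` is a made-up helper name, and the `gcc-14` candidate assumes a Homebrew GCC like the one discussed in this thread.

```shell
# cc_try_openmp COMPILER [FLAGS...]
# Succeeds if COMPILER can build and link a tiny OpenMP program with FLAGS.
cc_try_openmp() {
    omp_cc=$1
    shift
    omp_dir=$(mktemp -d) || return 1
    printf '%s\n' \
        '#include <omp.h>' \
        'int main(void) { return omp_get_max_threads() > 0 ? 0 : 1; }' \
        > "$omp_dir/t.c"
    "$omp_cc" "$@" "$omp_dir/t.c" -o "$omp_dir/t" >/dev/null 2>&1
    omp_rc=$?
    rm -rf "$omp_dir"
    return "$omp_rc"
}

# Probe the candidates; a missing compiler simply counts as a failure.
for candidate in gcc gcc-14; do
    if cc_try_openmp "$candidate" -fopenmp; then
        echo "$candidate: builds OpenMP code with -fopenmp"
    else
        echo "$candidate: no OpenMP with plain -fopenmp (or not installed)"
    fi
done
```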
$ ./omptest # works even with /usr/bin/gcc
Hello World... from thread = 0
<snip>
Hello World... from thread = 7
$ brew info libomp
==> libomp: stable 18.1.8 (bottled) [keg-only]
$ echo $LDFLAGS
-L/opt/homebrew/opt/libomp/lib
$ echo $CPPFLAGS
-I/opt/homebrew/opt/libomp/include
$ brew info gcc
==> gcc: stable 14.1.0 (bottled), HEAD
$ echo $PATH
/opt/homebrew/Cellar/gcc/14.1.0_1/bin:<snip> # no "gcc", only "gcc-14" ++
$ ./configure
OpenMP support ..................................... still no
@solardiz: If you're on another OS and out of ideas, hope someone on macOS can chip in, for optimal performance.
$ echo $PATH
/opt/homebrew/Cellar/gcc/14.1.0_1/bin: # no "gcc", only "gcc-14" ++
As some people on that StackOverflow wrote, you may need to use e.g. `gcc-14` explicitly. Try e.g. `./configure CC=gcc-14`. Alternatively, apparently it should also be possible to use libomp coming from Brew along with clang.
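Since `gcc` on macOS can be Apple clang under another name (as the `gcc --version` output earlier in this thread shows), a small helper can pick the first compiler on `PATH` that is a real GCC. This is a sketch under assumptions: `pick_gcc` and `is_really_clang` are made-up names, and the list of versioned binaries is a guess at what Homebrew might have installed.

```shell
# Succeeds if the given compiler identifies itself as clang.
is_really_clang() {
    "$1" --version 2>/dev/null | grep -qi 'clang'
}

# Print the first candidate that exists on PATH and is not clang in disguise.
pick_gcc() {
    for c in gcc gcc-14 gcc-13 gcc-12; do
        if command -v "$c" >/dev/null 2>&1 && ! is_really_clang "$c"; then
            echo "$c"
            return 0
        fi
    done
    return 1
}

pick_gcc || echo "no real GCC found on PATH"
```

If it prints a name, that name is what would go into `./configure CC=...`.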
@claudioandre-br Maybe you want to comment on how we're doing our CI builds for macOS - with OpenMP, right?
`CC=gcc-14` got somewhere! How to get a true "yes", or is "yes, but" as good as it gets?
OpenMP support ..................................... yes (not for fast formats)
You can force it to a plain `yes` with `--enable-openmp-for-fast-formats`, but we recommend that you don't do that.
@claudioandre-br Maybe you want to comment on how we're doing our CI builds for macOS - with OpenMP, right?
Yes, OpenMP on "default" clang. The `rolling-2404` is:
Version: 1.9.0-jumbo-1+bleeding-f9fedd238b 2024-04-01 13:35:37 +0200
Build: darwin23.5.0 64-bit arm ASIMD AC OMP OPENCL
SIMD: ASIMD, interleaving: MD4:2 MD5:2 SHA1:1 SHA256:1 SHA512:1
OMP fallback binary: john-arm64
$JOHN is ../run/
[...]
Max. Markov mode password length: 30
clang version: 15.0.0 (clang-1500.3.9.4) (gcc 4.2.1 compatibility)
OpenCL headers version: 1.2
Crypto library: OpenSSL
OpenSSL library version: 030300010
OpenSSL 3.3.1 4 Jun 2024
GMP library version: 6.3.0
[...]
Build on:
ProductVersion: 14.5
BuildVersion: 23F79
Homebrew 4.3.9
Xcode_15.4
`doc/INSTALL` recommends gcc (passing the `CC` value to `./configure`). That's what magnum used to use.
@claudioandre-br In other words, you're saying we get OpenMP with the exact same clang version that @forthrin also has, but does not get OpenMP with? Are we perhaps installing something from Homebrew to get this to work?
BTW, we have `doc/INSTALL*` files for several systems, but not for macOS. Maybe we should add one.
@claudioandre-br: Ah! I see this is actually thoroughly described at the bottom of DOC/INSTALL. Should have read the whole thing. Maybe `configure` should say:
OpenMP ... no (See DOC/INSTALL "Optimal build on macOS")
(Actually still says "OS X")
Happy to say `john` managed to crack a forgotten password which saved a ton of work!
Closing up:
1. Why is `--enable-openmp-for-fast-formats` discouraged? (Which are the "fast formats"?)
2. Can John make word lists such as:
(0-2 exclamation marks) (random word from list) (0 or 1 spaces) (number 0-9999) (0 or 1 spaces) (random word from list) (0-2 exclamation marks)
... or is it generally recommended to make such with a scripting language in advance? (Had a look at the mask and rules docs, but couldn't really figure out eg. combining two words.)
In other words, you're saying we get OpenMP with the exact same clang version that @forthrin also has, but does not get OpenMP with?
Yes. See the `-Xclang -fopenmp` at https://github.com/openwall/john-packages/blob/main/scripts/ci_controller.sh#L97.
There is some related magic and Google-sama might have an explanation on the matter. By the way: the order of the arguments is important.
See the `-Xclang -fopenmp` at https://github.com/openwall/john-packages/blob/main/scripts/ci_controller.sh#L97.
Should we possibly get this into the main project here, perhaps as a change to `configure.ac`? Would you test and contribute?
When linking to specific line numbers, let's also link to specific revisions - GitHub substitutes it when you press `y`. Anyway, I'll quote the relevant excerpt below this time.
if [[ "$TARGET_ARCH" == *"macOS"* ]]; then
    SYSTEM_WIDE=''
    REGULAR="$SYSTEM_WIDE $ASAN $BUILD_OPTS"
    NO_OPENMP="--disable-openmp $SYSTEM_WIDE $ASAN $BUILD_OPTS"
    #Libraries and Includes
    if [[ "$TARGET_ARCH" == *"macOS X86"* ]]; then
        MAC_LOCAL_PATH="usr/local/opt"
    else
        MAC_LOCAL_PATH="opt/homebrew/opt"
    fi
    LDFLAGS_ssl="-L/$MAC_LOCAL_PATH/openssl/lib"
    LDFLAGS_gmp="-L/$MAC_LOCAL_PATH/gmp/lib"
    LDFLAGS_omp="-L/$MAC_LOCAL_PATH/libomp/lib -lomp"
    CFLAGS_ssl="-I/$MAC_LOCAL_PATH/openssl/include"
    CFLAGS_gmp="-I/$MAC_LOCAL_PATH/gmp/include"
    CFLAGS_omp="-I/$MAC_LOCAL_PATH/libomp/include"
    if [[ $TARGET_ARCH == *"macOS ARM"* ]]; then
        brew link openssl --force
    fi
    if [[ $TARGET_ARCH == *"macOS X86"* ]]; then
        do_configure "$NO_OPENMP" --enable-simd=avx && do_build ../run/john-avx
        do_configure "$REGULAR" --enable-simd=avx LDFLAGS="$LDFLAGS_omp" CPPFLAGS="-Xclang -fopenmp $CFLAGS_omp -DOMP_FALLBACK_BINARY=\"\\\"john-avx\\\"\" " && do_build ../run/john-avx-omp
        do_configure "$NO_OPENMP" --enable-simd=avx2 && do_build ../run/john-avx2
        do_configure "$REGULAR" --enable-simd=avx2 LDFLAGS="$LDFLAGS_omp" CPPFLAGS="-Xclang -fopenmp $CFLAGS_omp -DOMP_FALLBACK_BINARY=\"\\\"john-avx2\\\"\" -DCPU_FALLBACK_BINARY=\"\\\"john-avx-omp\\\"\" " && do_build ../run/john-avx2-omp
        BINARY="john-avx2-omp"
    else
        do_configure "$NO_OPENMP" LDFLAGS="$LDFLAGS_ssl $LDFLAGS_gmp" CPPFLAGS="$CFLAGS_ssl $CFLAGS_gmp" && do_build "../run/john-$arch"
        do_configure "$REGULAR" LDFLAGS="$LDFLAGS_ssl $LDFLAGS_gmp $LDFLAGS_omp" CPPFLAGS="-Xclang -fopenmp $CFLAGS_ssl $CFLAGS_gmp $CFLAGS_omp -DOMP_FALLBACK_BINARY=\"\\\"john-$arch\\\"\" " && do_build ../run/john-omp
        BINARY="john-omp"
    fi
    do_release "No" "Yes" $BINARY # --system-wide, --support-opencl, --binary-name
fi
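Stripped of the CI plumbing, the OpenMP-specific part of the excerpt above comes down to three flags plus the libomp paths. The sketch below reduces it to a single `./configure` line for an Apple Silicon Mac. It keeps the excerpt's order (LDFLAGS, then CPPFLAGS with `-Xclang -fopenmp` first), it leaves out the OpenSSL/GMP paths and the fallback-binary defines, and it has not been verified here to be sufficient on every setup. `BREW_OPT` is a made-up variable name. The command is printed, not run.

```shell
# Assumption: Homebrew prefix on Apple Silicon, as in the excerpt's else-branch.
# On Intel Macs the excerpt uses /usr/local/opt instead.
BREW_OPT=/opt/homebrew/opt

OMP_CPPFLAGS="-Xclang -fopenmp -I$BREW_OPT/libomp/include"
OMP_LDFLAGS="-L$BREW_OPT/libomp/lib -lomp"

# Print the command instead of running it, so this is safe to paste anywhere.
echo "./configure LDFLAGS=\"$OMP_LDFLAGS\" CPPFLAGS=\"$OMP_CPPFLAGS\""
```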
the default benchmark being for a mix of two different cost settings is something we could want to fix, standardizing on one cost.
Maybe what we have now isn't that bad - it's same iteration count. I don't recall what effect on speeds the version has for the same iteration count. Looking at the test vectors we currently have, there aren't any two that look the same anyway - they all have data of different lengths.
Speed for cost 1 (iteration count) of 1000, cost 2 (version) of 2 and 1
1. Why is `--enable-openmp-for-fast-formats` discouraged? (Which are the "fast formats"?)
Those are implementations of hashes that are too fast for efficient scaling with our current use of OpenMP. We recommend usage of `--fork` on them instead of OpenMP.
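As a rough illustration of the `--fork` route, the process count can be taken from the number of online CPUs. The format and file name below are assumptions: `raw-md5` stands in for a fast format and `hashes.txt` is a hypothetical hash file. The command is printed rather than run.

```shell
# Number of online CPUs; getconf reports this on both macOS and Linux.
ncpu=$(getconf _NPROCESSORS_ONLN)

# Hypothetical inputs: raw-md5 as a fast format, hashes.txt as the hash file.
echo "john --fork=$ncpu --format=raw-md5 hashes.txt"
```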
2. Can John make word lists such as:
(0-2 exclamation marks) (random word from list) (0 or 1 spaces) (number 0-9999) (0 or 1 spaces) (random word from list) (0-2 exclamation marks)
... or is it generally recommended to make such with a scripting language in advance?
This varies. I'd normally use John itself, but for complicated cases such as yours it may take non-trivial configuration and/or multiple invocations of it. Here's a start, but then you need to write rule sets and likely use two invocations, piping the intermediate list between them:
./john -w=w --external=combinator --rules-stack=': /[ ] Ap" [0-9][0-9][0-9][0-9]"' --stdout
I'd have greater motivation to spend time on a more complete answer if we were discussing this on the john-users list, but here it's just not worth my time.
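For the "scripting language in advance" route from the question, a generator is short to write. This is a sketch using awk from a POSIX shell, not a replacement for the rules-based approach above. `gen_candidates` is a made-up name, the word list is read from stdin, and numbers are assumed to be written without leading zeros. The output grows as 36 * W^2 * (max + 1) lines for W words, so it is better streamed than stored.

```shell
# gen_candidates MAX < wordlist
# Emits: (0-2 "!") word (0-1 space) number 0..MAX (0-1 space) word (0-2 "!")
gen_candidates() {
    awk -v max="$1" '
        { words[n++] = $0 }
        END {
            split(",!,!!", bang, ",")   # bang[1]="", bang[2]="!", bang[3]="!!"
            split(", ", sp, ",")        # sp[1]="", sp[2]=" "
            for (a = 1; a <= 3; a++)
             for (i = 0; i < n; i++)
              for (s = 1; s <= 2; s++)
               for (num = 0; num <= max; num++)
                for (t = 1; t <= 2; t++)
                 for (j = 0; j < n; j++)
                  for (b = 1; b <= 3; b++)
                   print bang[a] words[i] sp[s] num sp[t] words[j] bang[b]
        }'
}

# Tiny demo: 2 words and numbers 0..1 give 36 * 2^2 * 2 = 288 candidates.
printf 'foo\nbar\n' | gen_candidates 1 | wc -l
```

For a real run this would be something like `gen_candidates 9999 < words.txt | john --stdin ...`, with `words.txt` being your own list.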
@solardiz: Good you're focusing your valuable time. Happy that OP segued into worthwhile compilation issues. Ping me for testing. Thanks for your help and keep up the great work!
PS! If the mailing list was moved to "Discussions" at some point that would be convenient!
Using OpenCL formats is much faster than their CPU counterparts, but they seem to have something of a one-minute startup penalty. (Accordingly, when running an actual crack, nothing happens until after about one minute.) Is this: