Open octopus-prime opened 2 years ago
OK, there seem to be some problems with the C++ version, so I'm trying pure C:
/opt/opencilk/bin/clang -fopencilk -O3 -o fib fib.c
time ./fib 50
fib(50) = 12586269025
real 0m24,681s
user 3m7,465s
sys 0m5,688s
and - wtf
time CILK_NWORKERS=2 ./fib 50
fib(50) = 12586269025
real 1m41,398s
user 3m22,750s
sys 0m0,021s
Now more CPUs are used, but execution is even SLOWER.
Thank you for trying out OpenCilk and for reaching out about this issue.
I'll address your second question first, because that answer is more straightforward. By default (i.e., if CILK_NWORKERS is not specified), OpenCilk will use as many workers as there are cores (more precisely, hardware threads) on the system. I'm guessing your system has more than 2 cores, so the run in your second message, with CILK_NWORKERS=2, is slower than the run where CILK_NWORKERS is left unset.
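To make the default concrete, here is a minimal sketch of the rule described above. The helper name `expected_workers` is hypothetical and this is not OpenCilk's actual implementation; in particular, falling back to the hardware-thread count when the variable holds an invalid value is an assumption.

```c
#include <stdlib.h>

/* Hypothetical helper, NOT part of OpenCilk: returns the worker count one
 * would expect from the rule "use CILK_NWORKERS if set, otherwise one worker
 * per hardware thread".
 *   env_value  - value of CILK_NWORKERS, or NULL if it is unset
 *   hw_threads - number of hardware threads on the machine */
long expected_workers(const char *env_value, long hw_threads) {
    if (env_value != NULL) {
        char *end = NULL;
        long n = strtol(env_value, &end, 10);
        /* Accept only a fully parsed, positive integer. */
        if (end != env_value && *end == '\0' && n > 0)
            return n;
    }
    /* Assumption: unset or invalid values fall back to the hardware threads. */
    return hw_threads;
}
```

On Linux you could call it as `expected_workers(getenv("CILK_NWORKERS"), sysconf(_SC_NPROCESSORS_ONLN))`, with `sysconf` coming from `<unistd.h>`. On an 8-thread machine this gives 8 when the variable is unset and 2 for CILK_NWORKERS=2, which matches the two runs above.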
As for your first question, it turns out we discovered this issue recently and pushed a fix for it to the current development branch of the compiler, dev/14.x. In the case of Cilk/C++ fib, a compiler optimization was effectively serializing the parallel constructs, causing the final executable to run as serial code. We have so far only observed the issue to affect Cilk/C++ fib (because of how aggressively the compiler can optimize fib), and as I mentioned, we have pushed a fix for the issue to the current development branch. If you would like to try out that fix, you can follow the instructions for building OpenCilk from source (https://www.opencilk.org/doc/users-guide/build-opencilk-from-source/#obtaining-the-opencilk-source-code-(detailed-instructions)), downloading the dev/14.x branch of opencilk-project instead of the opencilk/v2.0 tag.
It's also important to note that parallel fib, whether written in Cilk or another task-parallel programming platform, generally has high scheduling overheads. (In other words, the total computational work added to the program to enable parallel execution of fib is large relative to the computation of fib itself.) This scheduling overhead can be overcome with a sufficient number of processor cores. Hence, a performance comparison between serial and parallel versions of fib on small core counts will often show that serial fib achieves better performance. Because of the aforementioned optimization bug with the Cilk/C++ fib version, comparing the performance results from time ./fib 50 in your first and second messages effectively compares serial fib to Cilk-parallel fib run on (I'm guessing) 8 cores.
Please let us know if you have more questions or run into any other issues.
Not sure what to do...
I tried https://github.com/OpenCilk/infrastructure/blob/release/INSTALLING.md
git clone -b opencilk/v2.0.1 https://github.com/OpenCilk/infrastructure
infrastructure/tools/get $(pwd)/opencilk
infrastructure/tools/build $(pwd)/opencilk $(pwd)/build
1 hour build time (8 cores)
build/bin/clang++ -O3 -fopencilk fib.cpp
./a.out 50
Only 1 cpu is used again
In this case, the get script is cloning the OpenCilk source repositories at the opencilk/v2.0.1 tag. I recommend following the detailed instructions (https://github.com/OpenCilk/infrastructure/blob/release/INSTALLING.md#obtaining-the-opencilk-source-code-detailed-instructions) so you can explicitly download the dev/14.x branch of opencilk-project, instead of the opencilk/v2.0.1 tag.
We have a new release of OpenCilk that includes a fix for the issue you ran into. You can use this shell archive to install the new version of OpenCilk: https://github.com/OpenCilk/opencilk-project/releases/download/opencilk%2Fv2.1/opencilk-2.1.0-x86_64-linux-gnu-ubuntu-22.04.sh.
I tried your tutorial on Ubuntu 22.10
Download OpenCilk-2.0.0-x86_64-Linux-Ubuntu-22.04.sh
Download fib.c as fib.cpp
The execution is neither faster nor parallel. While fib is running, top shows that only ONE CPU is in use.