roconnor-blockstream closed 4 months ago
AFAICS, SHA NI is not enabled by default even when it is available. One has to call the `simplicity_cpu_optimize_not_thread_safe` function to explicitly opt into the optimization.
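For readers unfamiliar with this kind of opt-in, here is a minimal sketch of the general pattern, not the library's actual code. A function pointer starts at a portable routine, and an explicit call swaps it after a CPUID check. All `demo_*` names are hypothetical, and the "accelerated" routine is only a stand-in that returns the same result. The CPUID test (leaf 7, subleaf 0, EBX bit 29) is the documented SHA extensions flag, read here through the `__get_cpuid_count` helper from gcc/clang's `<cpuid.h>`.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical signature for a block-processing routine. */
typedef uint32_t (*demo_mix_fn)(const uint8_t *data, size_t len);

/* Portable fallback that is always available. */
static uint32_t demo_mix_portable(const uint8_t *data, size_t len) {
    uint32_t acc = 0;
    for (size_t i = 0; i < len; i++) {
        acc = acc * 31u + data[i];
    }
    return acc;
}

/* Stand-in for an accelerated routine; here it just reuses the fallback. */
static uint32_t demo_mix_accelerated(const uint8_t *data, size_t len) {
    return demo_mix_portable(data, len);
}

/* Currently selected implementation; defaults to the portable one. */
static demo_mix_fn demo_mix_impl = demo_mix_portable;

/* Returns 1 if CPUID reports SHA extensions (leaf 7, subleaf 0, EBX bit 29). */
static int demo_cpu_has_sha_ni(void) {
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        return 0; /* leaf 7 not supported */
    }
    return (int)((ebx >> 29) & 1u);
#else
    return 0; /* not an x86 target */
#endif
}

/* Explicit opt-in. The pointer write is unsynchronized, hence the name. */
void demo_cpu_optimize_not_thread_safe(void) {
    if (demo_cpu_has_sha_ni()) {
        demo_mix_impl = demo_mix_accelerated;
    }
}

/* Public entry point: dispatches through the selected implementation. */
uint32_t demo_mix(const uint8_t *data, size_t len) {
    return demo_mix_impl(data, len);
}
```

Until `demo_cpu_optimize_not_thread_safe()` is called, `demo_mix` always takes the portable path, which is the behaviour described above.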
@uncomputable I was under the impression that static libraries cannot do initialization before main. However, after some investigation, I see that it is sort of possible (in gcc / clang on Linux at least).
First I need to add `__attribute__((constructor))` to the `simplicity_cpu_optimize_not_thread_safe` function.
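As a rough illustration of what the attribute does (a sketch with hypothetical `demo_*` names, gcc / clang only): a function marked this way is run by the startup code before `main`, without anyone calling it explicitly.

```c
/* Flag that records whether the constructor has run. */
static int demo_ctor_ran = 0;

/* gcc/clang extension: the startup code calls this before main(). */
__attribute__((constructor))
static void demo_init(void) {
    demo_ctor_ran = 1;
}

/* Lets callers observe that initialization already happened. */
int demo_ctor_has_run(void) {
    return demo_ctor_ran;
}
```

When this translation unit is linked into a program, `demo_ctor_has_run()` already returns 1 on the first line of `main`.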
Secondly we have to overcome the problem that when making a static library, unreferenced functions are removed, and so, even though we have marked this function to be called during initialization, without any actual references to it, the function will just be removed during linking. See https://stackoverflow.com/a/1804618.
One answer is to build `cpu.o` separately from the static library and add `cpu.o` separately when linking. As noted in the Stack Overflow answer above, object files are always fully included into the main program during linking.
That said, you can see that in this solution, technically we are not doing initialization in the static library. We had to add a separate object file.
> Secondly we have to overcome the problem that when making a static library, unreferenced functions are removed, and so, even though we have marked this function to be called during initialization, without any actual references to it, the function will just be removed during linking.
My read of this is that there are two issues:
So this solution seems fragile.
- But sorta-separately, because you do not call anything from the .o file, the compiler is still not required to ever call your initialization function. Though probably any real compiler will do so.
Even if you don't call anything from the `.o` file, the compiler does call the initialization function. This is in fact what will happen in my upcoming proposal. This is because the `__attribute__((constructor))` places a pointer to the function into the `.init_array` section of the object file:
$ readelf -a result/lib/cpu.o
…
Relocation section '.rela.init_array' at offset 0x9f0 contains 1 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000000 001400000001 R_X86_64_64 0000000000000000 simplicity_cpu_optimiz + 0
…
I don't know what the ELF rules are, but I imagine the linker has no choice but to include the `init_array` stuff.
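To make the `.init_array` mechanism concrete, the sketch below places a function pointer into that section by hand instead of relying on the constructor attribute. This assumes an ELF target with gcc or clang (for example Linux); the `demo_*` names are hypothetical, and this is not how the library itself does it.

```c
/* Flag set by the manually registered initializer. */
static int demo_entry_ran = 0;

static void demo_manual_init(void) {
    demo_entry_ran = 1;
}

/* Put a pointer to the initializer into .init_array. The `used`
 * attribute keeps the compiler from discarding the variable even
 * though nothing in the program refers to it. */
__attribute__((section(".init_array"), used))
static void (*demo_init_entry)(void) = demo_manual_init;

/* Lets callers check that the startup code walked .init_array. */
int demo_entry_has_run(void) {
    return demo_entry_ran;
}
```

The startup code walks the entries of `.init_array` before `main`, which is why the function runs even though no code references it.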
Now I need to figure out how to do this in automake. :/
I've spent a day looking into this, and I cannot figure out how to replicate in Automake the trick I do in this Makefile (my research hinted it would be possible in CMake).
So I'm looking into another design that would merge what is currently "cpu.c" and "sha256.c" into a single file (a single translation unit). This works because linking static libraries works on the granularity of object files: if any symbol in an object file is referenced, then the entire object file is linked in. No fancy makefile tricks needed.
This new version incorporates `cpu.c`, now called `sha256_x86.inc`, into `sha256.c`. Now the initialization function is included whenever any function from `sha256.h` is required.
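A simplified sketch of that single-translation-unit layout follows. The names are hypothetical and the backend selection is reduced to setting a flag. The point is only that the constructor and the public function end up in the same object file, so referencing the public function pulls the constructor in with it.

```c
#include <stdint.h>

/* Part that previously lived in its own file (cpu.c in this thread). */
static int demo_backend_selected = 0;

__attribute__((constructor))
static void demo_select_backend(void) {
    /* A real implementation would query the CPU here. */
    demo_backend_selected = 1;
}

/* Public API, as it would be declared in the header. */
static uint32_t demo_rotr(uint32_t x, unsigned n) {
    return (x >> n) | (x << (32u - n));
}

/* SHA-256 "big sigma 0": rotr 2, rotr 13, rotr 22, XORed together. */
uint32_t demo_big_sigma0(uint32_t x) {
    return demo_rotr(x, 2) ^ demo_rotr(x, 13) ^ demo_rotr(x, 22);
}

/* Lets callers check that the constructor in this object ran. */
int demo_backend_was_selected(void) {
    return demo_backend_selected;
}
```

Any program that uses `demo_big_sigma0` from a static library built this way links the whole object file, constructor included, with no extra build-system steps.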
Relax timing limit somewhat, and check for sha_ni CPU support.
Timing in testing requires sha_ni CPU support.