Open fadedbee opened 3 years ago
The implementation chosen is the most recent instruction set supported on the current machine. Currently that's mainly Intel instruction sets, so you get:
avx512 if AVX-512 is supported
avx2 if AVX2 is supported
sse41 if SSE4.1 is supported
sse2 if SSE2 is supported
portable otherwise
You can see what your CPU supports with e.g. cat /proc/cpuinfo on Linux. The only other thing that can affect dispatching is if you disabled some instruction sets at build time, like with -DBLAKE3_NO_AVX512.
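The priority order described above can be sketched as a small standalone function. This is only an illustration, not BLAKE3's actual dispatch code: pick_impl and detect_impl are hypothetical names, and the CPU query assumes GCC or Clang on an x86 target, where the __builtin_cpu_supports builtin is available.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of BLAKE3: returns the name of the first
   supported implementation, newest instruction set first. */
const char *pick_impl(bool avx512, bool avx2, bool sse41, bool sse2) {
#if !defined(BLAKE3_NO_AVX512)
  /* Compiled out when AVX-512 was disabled at build time. */
  if (avx512) return "avx512";
#endif
  if (avx2) return "avx2";
  if (sse41) return "sse41";
  if (sse2) return "sse2";
  return "portable";
}

/* Hypothetical helper: asks the running CPU which features it has.
   On anything other than GCC/Clang on x86 it reports "portable". */
const char *detect_impl(void) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();
  return pick_impl(
      __builtin_cpu_supports("avx512f") && __builtin_cpu_supports("avx512vl"),
      __builtin_cpu_supports("avx2"),
      __builtin_cpu_supports("sse4.1"),
      __builtin_cpu_supports("sse2"));
#else
  return pick_impl(false, false, false, false);
#endif
}
```

Printing the result of detect_impl() from a small test program gives a quick guess at which implementation to expect on a given machine, which can then be compared against what the library actually does.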
If you want to see this in action, I'd suggest putting some print statements in blake3_hash_many and then hashing a 100 KB input. (You could also log from blake3_compress_in_place or blake3_compress_xof, but AVX2 doesn't show up in those functions.)
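A function like blake3_hash_many is likely to be called many times for a 100 KB input, so a plain print statement can flood the terminal. One possible approach is a log-once helper, sketched below. trace_once and TRACE_ONCE are hypothetical names, not part of BLAKE3, and the sketch is not thread-safe.

```c
#include <stdio.h>

/* Hypothetical helper, not part of BLAKE3. On the first call with a given
   flag it prints the label to stderr and returns 1; later calls with the
   same flag print nothing and return 0. */
int trace_once(int *seen, const char *label) {
  if (*seen) return 0;
  *seen = 1;
  fprintf(stderr, "blake3: using %s\n", label);
  return 1;
}

/* Gives each call site its own static flag, so every place the macro is
   used reports at most once per process. Not thread-safe. */
#define TRACE_ONCE(label)                  \
  do {                                     \
    static int seen_here = 0;              \
    (void)trace_once(&seen_here, (label)); \
  } while (0)
```

Placing a line such as TRACE_ONCE("sse41"); at the start of each branch in the function being inspected would then print one line per implementation that is actually reached.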
As for a public, stable debugging API, what would be the expected use case there?
Thanks for your response.
Can you leave this issue open, as I'm getting some very odd results? Mostly the SSE41 code is run, occasionally the SSE2 code. My CPU reports "avx":
$ cat /proc/cpuinfo | grep sse | head -1
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts md_clear flush_l1d
More debug to follow...
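Note that grep sse matches any line containing that substring, and names like avx and avx2 share a prefix, so it helps to compare whole tokens when reading the flags line. In the output above, avx and sse4_1 appear as tokens but avx2 does not. A minimal sketch of a whole-token check follows; has_flag is a hypothetical helper, not part of BLAKE3.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if `name` appears in `flags` as a whole
   whitespace-separated token, and 0 otherwise. A plain substring search
   would wrongly treat "avx" as present whenever "avx2" is. */
int has_flag(const char *flags, const char *name) {
  size_t n = strlen(name);
  const char *p = flags;
  if (n == 0) return 0;
  while ((p = strstr(p, name)) != NULL) {
    /* The match must start at the beginning or right after whitespace... */
    int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
    /* ...and must end at the end of the string or right before whitespace. */
    int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' || p[n] == '\n';
    if (starts && ends) return 1;
    p += 1; /* keep searching after this partial match */
  }
  return 0;
}
```

Applied to the flags line above, has_flag(flags, "avx") is true while has_flag(flags, "avx2") is false.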
When you say "Mostly the SSE41 code is run", could you clarify what that means? Where are you putting your print statements, and what commands are you running to test the code?
I've embedded the various BLAKE3 C source files in an existing autotools project. (I had to add AM_PROG_AS to configure.ac.)
How can I confirm that the blake3_dispatch is working correctly? Ideally I'd like to set -DDEBUG and see which implementation it was choosing. Or perhaps a new function in blake3_dispatch to report which specialised code was used?
Assuming that debugging output of this sort doesn't currently exist, would it be a welcome addition upstream?