rurban / smhasher

Hash function quality and speed tests
https://rurban.github.io/smhasher/
Other
1.81k stars 174 forks source link

PIC register clobbered by ‘%ebx’ in ‘asm’ or illegal instruction #245

Closed tansy closed 1 year ago

tansy commented 1 year ago

In SpeedTest.cpp functions `timehash()', `timehash_small()', `HashMapSpeedTest()' use `timer_start()', `timer_end()' which on 32-bit linux use assembler hacked substitute os `rdtsc()' (defined in Platform.h). GCC fails to compile it with following error:

Platform.h: In function ‘int64_t timehash(pfHash, const void*, int, int)
’:
Platform.h:179:39: error: PIC register clobbered by ‘%ebx’ in ‘asm’
        "%eax", "%ebx", "%ecx", "%edx");
                                       ^
Platform.h:204:39: error: PIC register clobbered by ‘%ebx’ in ‘asm’
        "%eax", "%ebx", "%ecx", "%edx");
                                       ^

(Full error log)

which is good as you know there is something wrong with it. Clang doesn't even warn about that but generated program crashes with 'illegal instruction'.

When this calls are replaced with rdtsc() it works.

--- SpeedTest.cpp~  2022-11-03 10:04:58.000000000 +0100
+++ SpeedTest.cpp   2022-11-10 00:57:53.000000000 +0100
@@ -169,11 +169,11 @@ NEVER_INLINE int64_t timehash ( pfHash h
   volatile int64_t begin, end;
   uint32_t temp[16];

-  begin = timer_start();
+  begin = rdtsc();

   hash(key,len,seed,temp);

-  end = timer_end();
+  end = rdtsc();

   return end - begin;
 }

(All changes are here in case you wanted to have a look) It's unnecessary to use assembler version that is wrong when there is rdtsc() available in compiler (gcc-4.7+, clang-3.8+, mingw-gcc-4.6+; afaic). If not then at least it could be corrected. I don't undertake this as inline assembler is difficult and tricky, as here. Here some more info I used to understand it, maybe you find it useful.

rurban commented 1 year ago

Certainly not. You didn't ever to look at the code and it's link to ia-32-ia-64-benchmark-code-execution-paper.pdf. rdtsc is not accurate at all, timings were far off with that alone.

%ebx is not needed, so it can be removed.

tansy commented 1 year ago

You didn't ever to look at the code and it's link to ia-32-ia-64-benchmark-code-execution-paper.pdf

I did, but I saw an inline assembler which I don't get, and found that gcc does something similar in their rdtsc(). That's what I meant.

You can always fix if you understand that.