Closed floyd-fuh closed 9 years ago
Thanks for reporting this. Indeed this code is not compatible with ARM processors. For compatibility we can disable high-precision timing in ARM, and/or use the SysTick feature in most recent ARMs (see http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka8713.html, https://github.com/google/benchmark/blob/master/src/cycleclock.h).
Just tried something like that. Seems like it could be a little pain to make sure it works with all the ARM versions/precompiler macros. My short attempt was this one ripped out of gperftools with the newest git version of argon2, where argon2-test.c was renamed to main.c. Obviously this is highly flawed (we don't even have a return value for all cases):
#elif __arm__ // TODO: V6 is the earliest arch that has a standard cyclecount, will not work for pre-V6
uint32_t pmccntr;
uint32_t pmuseren;
uint32_t pmcntenset;
// Read the user mode perf monitor counter access permissions.
__asm__ __volatile__ ("mrc p15, 0, %0, c9, c14, 0" : "=r" (pmuseren));
if (pmuseren & 1) { // Allows reading perfmon counters for user mode code.
__asm__ __volatile__ ("mrc p15, 0, %0, c9, c12, 1" : "=r" (pmcntenset));
if (pmcntenset & 0x80000000ul) { // Is it counting?
__asm__ __volatile__ ("mrc p15, 0, %0, c9, c13, 0" : "=r" (pmccntr));
// The counter is set up to count every 64th cycle
return pmccntr << 6;
}
}
However the benchmarking seems to produce wrong results:
$ ./argon2-plain/argon2 b Argon2d 1 pass(es) 1 Mbytes 1 threads: 4190208.00 cpb 4190208.00 Mcycles Argon2i 1 pass(es) 1 Mbytes 1 threads: 4307550208.00 cpb 4307550208.00 Mcycles 0.0217 seconds
Argon2d 1 pass(es) 1 Mbytes 2 threads: 4194304.00 cpb 4194304.00 Mcycles Argon2i 1 pass(es) 1 Mbytes 2 threads: 4307550208.00 cpb 4307550208.00 Mcycles 0.0224 seconds
Argon2d 1 pass(es) 1 Mbytes 4 threads: 4194304.00 cpb 4194304.00 Mcycles Argon2i 1 pass(es) 1 Mbytes 4 threads: 4307550208.00 cpb 4307550208.00 Mcycles 0.0146 seconds
Argon2d 1 pass(es) 1 Mbytes 6 threads: 4194304.00 cpb 4194304.00 Mcycles Argon2i 1 pass(es) 1 Mbytes 6 threads: 4240441344.00 cpb 4240441344.00 Mcycles 0.0168 seconds
Argon2d 1 pass(es) 1 Mbytes 8 threads: 4194304.00 cpb 4194304.00 Mcycles Argon2i 1 pass(es) 1 Mbytes 8 threads: 4307550208.00 cpb 4307550208.00 Mcycles 0.0155 seconds
Argon2d 1 pass(es) 1 Mbytes 16 threads: 4194304.00 cpb 4194304.00 Mcycles Argon2i 1 pass(es) 1 Mbytes 16 threads: 4307550208.00 cpb 4307550208.00 Mcycles 0.0255 seconds
I suppose I let the maintainers figure this out, but I'm happy to help further testing on ARM.
Btw. it doesn't compile on 32bit either (compiled with gcc, regular "make"):
gcc -std=c99 -pthread -O3 -Wall -g src/argon2.c src/core.c src/kat.c src/blake2/blake2b-ref.c src/ref.c src/main.c -Isrc -o argon2
src/main.c: In function 'run':
src/main.c:204:9: warning: implicit declaration of function 'strdup' [-Wimplicit-function-declaration]
in = ( uint8_t * )strdup( pwd );
^
src/main.c: In function 'benchmark':
src/main.c:57:5: error: inconsistent operand constraints in an 'asm'
__asm__ __volatile__ ( "rdtsc" : "=a" ( rax ), "=d" ( rdx ) : : );
^
src/main.c:57:5: error: inconsistent operand constraints in an 'asm'
__asm__ __volatile__ ( "rdtsc" : "=a" ( rax ), "=d" ( rdx ) : : );
^
src/main.c:57:5: error: inconsistent operand constraints in an 'asm'
__asm__ __volatile__ ( "rdtsc" : "=a" ( rax ), "=d" ( rdx ) : : );
^
make: *** [argon2] Error 1
$ uname -a
Linux user-xubuntu-32-vbox 3.13.0-39-generic #66-Ubuntu SMP Tue Oct 28 13:31:23 UTC 2014 i686 i686 i686 GNU/Linux
Thanks for the update! For the latest message about 32-bit, what's the CPU? (cat /proc/cpuinfo?)
@sneves just pushed a change that should solve the issue on 32-bit x86.
Confirmed, compiles on 32bit now. Leaving the bug open for ARM.
Things should build OK now on ARM. Exceptions are:
make OPT=TRUE
does not build on ARM, since it is x86-specific;make bench
also does not work by the same reason it didn't before---no portable timestamp counter available.I can use Linux's perf_event
API with PERF_COUNT_HW_CPU_CYCLES
to read cycles, but that will be (obviously) Linux specific.
System:
A configure script would be nice. The code doesn't honor $CC.