jschanck / ntru

Implementations of the NIST post-quantum cryptography process finalist NTRU.
https://ntru.org
Creative Commons Zero v1.0 Universal

avx2 fails to compile on a 64-bit-only system #9

Closed mouse07410 closed 4 years ago

mouse07410 commented 4 years ago

3.5 GHz Dual-Core Intel Core i7, macOS Catalina 10.15.6, Xcode-11.6, GCC-10, current master

Fails to compile on a 64-bit-only OS:

. . . . .
poly_rq_mul.s:3:8: error: invalid alignment value
.align 32
       ^
poly_rq_mul.s:225:1: error: unknown directive
.hidden poly_Rq_mul
^
poly_r2_mul.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_r2_mul.s:3:8: error: invalid alignment value
.align 32
       ^
poly_r2_mul.s:107:1: error: unknown directive
.hidden poly_R2_mul
^
poly_rq_to_s3.s:2:17: error: unexpected token in '.section' directive
.section .rodata
. . . . .

Complete build log: avx2-build.txt

jschanck commented 4 years ago

Thanks, this should be fixed by 3a00580.

mouse07410 commented 4 years ago

@jschanck thank you - part of the problems are fixed now. Unfortunately, some of the assembler directives still fail (unknown directive and unexpected token in '.section' directive).

gcc -O3 -fomit-frame-pointer -march=native -fPIC -fPIE -pie -Wall -Wextra -Wpedantic -o test/test_polymul fips202.c kem.c owcpa.c pack3.c packq.c poly.c poly_r2_inv.c sample.c sample_iid.c verify.c randombytes.c square_1_701_patience.s square_3_701_patience.s square_6_701_patience.s square_12_701_shufbytes.s square_15_701_shufbytes.s square_27_701_shufbytes.s square_42_701_shufbytes.s square_84_701_shufbytes.s square_168_701_shufbytes.s square_336_701_shufbytes.s poly_rq_mul.s poly_r2_mul.s poly_rq_to_s3.s vec32_sample_iid.s poly_mod_3_Phi_n.s poly_mod_q_Phi_n.s poly_s3_to_rq.s poly_s3_inv.s poly_rq_mul_x_minus_1.s test/test_polymul.c cpucycles.c
square_1_701_patience.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_1_701_patience.s:6:1: error: unknown directive
.hidden square_1_701
^
square_3_701_patience.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_3_701_patience.s:6:1: error: unknown directive
.hidden square_3_701
^
square_6_701_patience.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_6_701_patience.s:6:1: error: unknown directive
.hidden square_6_701
^
square_12_701_shufbytes.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_12_701_shufbytes.s:6636:1: error: unknown directive
.hidden square_12_701
^
square_15_701_shufbytes.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_15_701_shufbytes.s:11774:1: error: unknown directive
.hidden square_15_701
^
square_27_701_shufbytes.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_27_701_shufbytes.s:4924:1: error: unknown directive
.hidden square_27_701
^
square_42_701_shufbytes.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_42_701_shufbytes.s:5480:1: error: unknown directive
.hidden square_42_701
^
square_84_701_shufbytes.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_84_701_shufbytes.s:4230:1: error: unknown directive
.hidden square_84_701
^
square_168_701_shufbytes.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_168_701_shufbytes.s:7284:1: error: unknown directive
.hidden square_168_701
^
square_336_701_shufbytes.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_336_701_shufbytes.s:6198:1: error: unknown directive
.hidden square_336_701
^
poly_rq_mul.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_rq_mul.s:325:1: error: unknown directive
.hidden poly_Rq_mul
^
poly_r2_mul.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_r2_mul.s:107:1: error: unknown directive
.hidden poly_R2_mul
^
poly_rq_to_s3.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_rq_to_s3.s:123:1: error: unknown directive
.hidden poly_Rq_to_S3
^
vec32_sample_iid.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
vec32_sample_iid.s:89:1: error: unknown directive
.hidden vec32_sample_iid
^
poly_mod_3_Phi_n.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_mod_3_Phi_n.s:56:1: error: unknown directive
.hidden poly_mod_3_Phi_n
^
poly_mod_q_Phi_n.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_mod_q_Phi_n.s:5:1: error: unknown directive
.hidden poly_mod_q_Phi_n
^
poly_s3_to_rq.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_s3_to_rq.s:293:1: error: unknown directive
.hidden poly_lift
^
poly_s3_inv.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_s3_inv.s:464:1: error: unknown directive
.hidden poly_S3_inv
^
poly_rq_mul_x_minus_1.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_rq_mul_x_minus_1.s:89:1: error: unknown directive
.hidden poly_Rq_mul_x_minus_1
^
make: *** [test/test_polymul] Error 1

Update

Not sure if this is relevant: https://github.com/ClickHouse/ClickHouse/issues/8530

mouse07410 commented 4 years ago

The "offending" code that macOS and its toolchain don't support is (an excerpt, not a complete list):

avx2-hps2048509/asmgen/rq_mul/poly_rq_mul.py
111:    p(".section .rodata")

avx2-hps2048509/asmgen/poly_mod_q_Phi_n.py
8:    p(".section .rodata")

avx2-hps2048509/asmgen/poly_rq_to_s3.py
11:    p(".section .rodata")

avx2-hps2048509/asmgen/poly_mod_3_Phi_n.py
9:    p(".section .rodata")

and

avx2-hps4096821/asmgen/poly_rq_to_s3.py
30:    p(".hidden {}poly_Rq_to_S3".format(NAMESPACE))

avx2-hps4096821/asmgen/poly_mod_q_Phi_n.py
12:    p(".hidden {}poly_mod_q_Phi_n".format(NAMESPACE))

avx2-hps4096821/asmgen/poly_mod_3_Phi_n.py
15:    p(".hidden {}poly_mod_3_Phi_n".format(NAMESPACE))

As I understand it, Mac toolchains don't support these, but Linux does. So, would it be possible to add some kind of guard to skip those if the OS is macOS (aka Darwin)?
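
For illustration, a guard of this kind could be sketched as a drop-in replacement for the `p = print` alias used in the asmgen scripts. The `platform` and `out` parameters and the `ELF_ONLY_PREFIXES` name are hypothetical and not part of the repository; the sketch only assumes that `sys.platform` is "darwin" on macOS.

```python
import sys

# Directives this thread reports as rejected by the macOS assembler.
ELF_ONLY_PREFIXES = (".section .rodata", ".hidden ")


def p(line, platform=sys.platform, out=print):
    """Print one assembly line, dropping ELF-only directives on macOS.

    Returns True if the line was emitted, False if it was skipped.
    """
    if platform == "darwin" and line.startswith(ELF_ONLY_PREFIXES):
        return False  # skipped: the macOS assembler cannot parse it
    out(line)
    return True
```

With a wrapper like this the generator scripts could stay unchanged, since every directive already goes through `p(...)`; whether that is preferable to explicit `if platform != "darwin":` checks is a matter of taste.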

jschanck commented 4 years ago

Maybe. I could drop the ".hidden" directives without consequence, but the ".section .rodata" need to be changed. And that might just surface some other issues later in the build. You might be able to build an ELF object file with the right assembler flags, and then pass the resulting object file through objconv. Sort of a kludge.

Let me know if you find a solution. I don't have a good macOS setup right now.
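
One way of changing the directives rather than dropping them would be to translate them to Mach-O counterparts. The sketch below assumes that `.section __TEXT,__const` and `.private_extern` are the appropriate Mach-O equivalents of `.section .rodata` and `.hidden`; that mapping was not tested anywhere in this thread, and the `translate` helper is hypothetical. Symbol names are passed through untouched, so any underscore prefixing would have to be handled separately.

```python
import sys

# ELF directive -> Mach-O counterpart (assumed mapping, untested in this thread).
MACHO_EQUIVALENTS = {
    ".section .rodata": ".section __TEXT,__const",
    ".hidden": ".private_extern",
}


def translate(line, platform=sys.platform):
    """Return `line` rewritten for the Mach-O assembler when on macOS."""
    if platform != "darwin":
        return line  # ELF targets keep the original directive
    stripped = line.strip()
    if stripped == ".section .rodata":
        return MACHO_EQUIVALENTS[".section .rodata"]
    if stripped.startswith(".hidden "):
        symbol = stripped.split(None, 1)[1]
        return "{} {}".format(MACHO_EQUIVALENTS[".hidden"], symbol)
    return line
```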

mouse07410 commented 4 years ago

I could drop the ".hidden" directives without consequence

Then, could you do so please?

but the ".section .rodata" need to be changed. And that might just surface some other issues later in the build

I'll experiment. I'm pretty sure there won't be any issues related to removal of this, because I've encountered this problem in other code - with the solution of just removing those .section .rodata lines invariably working fine.

But if you insist, I'll see if I can find a way to guard these commands, so it's removed only on macOS.

Let me know if you find a solution.

My proposed solution that I'll try on macOS and Linux (CentOS-8 and Ubuntu-18) is just dropping this directive altogether. Possibly, dropping it only for macOS - but it would be much simpler if one can abolish it completely.

I don't have a good macOS setup right now.

I've a very good macOS setup, and would be happy to run the trials/tests for you.

Update

Another likely problem to surface is global names that macOS toolchain prefixes with underscore _, but other systems (Linux) usually leave as-is. A blunt way to solve this (which I used with other code) was just duplicating the .global into something like

.global funcname
.global _funcname

Would that be acceptable?
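
In a Python generator, the duplication could be sketched as below. `global_labels` is a hypothetical helper, not something in the repository; it emits both spellings of the name as `.global` declarations and as labels, so that both refer to the same address.

```python
import sys


def global_labels(name, platform=sys.platform):
    """Return lines declaring and defining `name`, plus `_name` on macOS."""
    names = [name]
    if platform == "darwin":
        # C symbols carry a leading underscore on macOS
        names.append("_" + name)
    lines = [".global {}".format(n) for n in names]
    lines += ["{}:".format(n) for n in names]
    return lines
```

For example, `print("\n".join(global_labels("funcname")))` would print the two `.global` lines above followed by the two labels when run on macOS, and only the unprefixed pair elsewhere.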

Update 2

You might be able to build an ELF object file

Alas, doesn't work - the problem is parsing, not generating. And my only tools are Clang with the Xcode assembler, yasm, and nasm. None of them can parse these .s files, with Clang and yasm being the closest to succeeding - they only barf on .section .rodata, .hidden, and .att_syntax.
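
To see how many lines are affected before patching anything, the generated .s files could be scanned for those three directives. This is only a rough sketch: the `find_unsupported` helper and its regular expression are hypothetical, and the list of directives is taken from this comment rather than from any assembler documentation.

```python
import re

# Directives this thread reports Clang and yasm rejecting on macOS.
UNSUPPORTED = re.compile(r"^\s*(\.section\s+\.rodata|\.hidden|\.att_syntax)\b")


def find_unsupported(asm_text):
    """Return (line_number, directive) pairs for lines starting with a flagged directive."""
    hits = []
    for number, line in enumerate(asm_text.splitlines(), start=1):
        match = UNSUPPORTED.match(line)
        if match:
            hits.append((number, match.group(1)))
    return hits
```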

mouse07410 commented 4 years ago

I've re-worked my fix of avx2-hrss701/.

Here's the current/latest patch that I've tested with Clang-10 and GCC-10, on macOS Catalina 10.15.6 with Xcode-11.6 and CentOS 8. avx2-hrss-mac.diff.txt

This is what the approach looks like (example of one file):

diff --git a/avx2-hrss701/asmgen/poly_mod_3_Phi_n.py b/avx2-hrss701/asmgen/poly_mod_3_Phi_n.py
index 9c7a5d3..aace969 100644
--- a/avx2-hrss701/asmgen/poly_mod_3_Phi_n.py
+++ b/avx2-hrss701/asmgen/poly_mod_3_Phi_n.py
@@ -1,20 +1,28 @@
 p = print

+from sys import platform
+
 from params import *
 from mod3 import mod3, mod3_masks

 if __name__ == '__main__':
     p(".data")
-    p(".section .rodata")
+    if platform != "darwin":
+        p(".section .rodata")
     p(".p2align 5")

     mod3_masks()

     p(".text")
-    p(".hidden {}poly_mod_3_Phi_n".format(NAMESPACE))
-    p(".global {}poly_mod_3_Phi_n".format(NAMESPACE))
-    p(".att_syntax prefix")
-
+    if platform == "darwin":
+        p(".global {}poly_mod_3_Phi_n".format(NAMESPACE))
+        p(".global _{}poly_mod_3_Phi_n".format(NAMESPACE))
+    else:
+        p(".hidden {}poly_mod_3_Phi_n".format(NAMESPACE))
+        p(".global {}poly_mod_3_Phi_n".format(NAMESPACE))
+        p(".att_syntax prefix")
+
+    p("_{}poly_mod_3_Phi_n:".format(NAMESPACE))
     p("{}poly_mod_3_Phi_n:".format(NAMESPACE))
     # rdi holds r

I would appreciate it if you could apply it.

Unfortunately, there's about three times more work to make the other three AVX2 subdirectories Mac-compatible. Would you be able to do that?

jschanck commented 4 years ago

Thanks for looking into this. I just pushed a "macOS" branch which has a potential fix on it. Let me know if it works.

mouse07410 commented 4 years ago

Thanks for looking into this.

You're welcome.

I just pushed a "macOS" branch which has a potential fix on it. Let me know if it works.

Sorry to say, it doesn't. I don't think anything less than what my patch does would fix the AVX2 code, because the Mac toolchain cannot handle .section .rodata. So, the code that generates .section .rodata must be disabled for Mac, as my patch does. Also, once that compiles - there's likely to be an issue of C code expecting external functions to have _ prepended to their names, while the generated assembly code does not add it. This is likely to cause the link step to fail - but we're not there yet with the macOS branch. ;-) I see you've taken care of that - thanks!

At least I haven't seen (yet?) the complaints about .hidden XXX in this branch.

$ make -C avx2-hrss701 clean test
find . -name '*.pyc' -delete
find . -name '__pycache__' -delete
rm -f *.o
rm -f *.s
rm -f -r test/test_polymul
rm -f -r test/test_ntru
rm -f -r test/test_pack
rm -f -r test/speed
rm -f -r test/ram
rm -f -r test/encap
rm -f -r test/decap
rm -f -r test/keypair
rm -f -r test/speed_r2_inv
rm -f PQCgenKAT_kem
rm -f PQCkemKAT_*.req
rm -f PQCkemKAT_*.rsp
PYTHONPATH=bitpermutations \
     python3 bitpermutations/applications/squaring_mod_GF2N.py \
     --patience --callee 6 --namespace= --raw-name 1 \
     > square_1_701_patience.s
PYTHONPATH=bitpermutations \
     python3 bitpermutations/applications/squaring_mod_GF2N.py \
     --patience --callee 6 --namespace= --raw-name 3 \
     > square_3_701_patience.s
PYTHONPATH=bitpermutations \
     python3 bitpermutations/applications/squaring_mod_GF2N.py \
     --patience --callee 6 --namespace= --raw-name 6 \
     > square_6_701_patience.s
PYTHONPATH=bitpermutations \
     python3 bitpermutations/applications/squaring_mod_GF2N.py \
     --shufbytes --namespace= --raw-name 12 \
     > square_12_701_shufbytes.s
PYTHONPATH=bitpermutations \
     python3 bitpermutations/applications/squaring_mod_GF2N.py \
     --shufbytes --namespace= --raw-name 15 \
     > square_15_701_shufbytes.s
PYTHONPATH=bitpermutations \
     python3 bitpermutations/applications/squaring_mod_GF2N.py \
     --shufbytes --namespace= --raw-name 27 \
     > square_27_701_shufbytes.s
PYTHONPATH=bitpermutations \
     python3 bitpermutations/applications/squaring_mod_GF2N.py \
     --shufbytes --namespace= --raw-name 42 \
     > square_42_701_shufbytes.s
PYTHONPATH=bitpermutations \
     python3 bitpermutations/applications/squaring_mod_GF2N.py \
     --shufbytes --namespace= --raw-name 84 \
     > square_84_701_shufbytes.s
PYTHONPATH=bitpermutations \
     python3 bitpermutations/applications/squaring_mod_GF2N.py \
     --shufbytes --namespace= --raw-name 168 \
     > square_168_701_shufbytes.s
PYTHONPATH=bitpermutations \
     python3 bitpermutations/applications/squaring_mod_GF2N.py \
     --shufbytes --namespace= --raw-name 336 \
     > square_336_701_shufbytes.s
python3 asmgen/rq_mul/poly_rq_mul.py asmgen/rq_mul/K2_schoolbook_64x11.py asmgen/rq_mul/K2_K2_64x44.py > poly_rq_mul.s
python3 asmgen/poly_r2_mul.py > poly_r2_mul.s
python3 asmgen/poly_rq_to_s3.py > poly_rq_to_s3.s
python3 asmgen/vec32_sample_iid.py > vec32_sample_iid.s
python3 asmgen/poly_mod_3_Phi_n.py > poly_mod_3_Phi_n.s
python3 asmgen/poly_mod_q_Phi_n.py > poly_mod_q_Phi_n.s
python3 asmgen/poly_s3_to_rq.py > poly_s3_to_rq.s
python3 asmgen/poly_s3_inv.py > poly_s3_inv.s
python3 asmgen/poly_rq_mul_x_minus_1.py > poly_rq_mul_x_minus_1.s
/usr/bin/cc -O3 -fomit-frame-pointer -march=native -fPIC -fPIE -pie -Wall -Wextra -Wpedantic -o test/test_polymul fips202.c kem.c owcpa.c pack3.c packq.c poly.c poly_r2_inv.c sample.c sample_iid.c verify.c randombytes.c square_1_701_patience.s square_3_701_patience.s square_6_701_patience.s square_12_701_shufbytes.s square_15_701_shufbytes.s square_27_701_shufbytes.s square_42_701_shufbytes.s square_84_701_shufbytes.s square_168_701_shufbytes.s square_336_701_shufbytes.s poly_rq_mul.s poly_r2_mul.s poly_rq_to_s3.s vec32_sample_iid.s poly_mod_3_Phi_n.s poly_mod_q_Phi_n.s poly_s3_to_rq.s poly_s3_inv.s poly_rq_mul_x_minus_1.s test/test_polymul.c cpucycles.c
clang: warning: argument unused during compilation: '-pie' [-Wunused-command-line-argument]
square_1_701_patience.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_3_701_patience.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_6_701_patience.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_12_701_shufbytes.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_15_701_shufbytes.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_27_701_shufbytes.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_42_701_shufbytes.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_84_701_shufbytes.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_168_701_shufbytes.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
square_336_701_shufbytes.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_rq_mul.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_r2_mul.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_rq_to_s3.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
vec32_sample_iid.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_mod_3_Phi_n.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_mod_q_Phi_n.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_s3_to_rq.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_s3_inv.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
poly_rq_mul_x_minus_1.s:2:17: error: unexpected token in '.section' directive
.section .rodata
                ^
cpucycles.c:6:3: warning: extension used [-Wlanguage-extension-token]
  asm volatile(".byte 15;.byte 49;shlq $32,%%rdx;orq %%rdx,%%rax"
  ^
1 warning generated.
make: *** [test/test_polymul] Error 1

$ git branch
* macOS
  master

Unrelated - what do you think about this change (I'd welcome it!):

diff --git a/avx2-hrss701/Makefile b/avx2-hrss701/Makefile
index acfeb76..cde95b5 100644
--- a/avx2-hrss701/Makefile
+++ b/avx2-hrss701/Makefile
@@ -1,4 +1,4 @@
-CC = /usr/bin/cc
+CC ?= /usr/bin/cc
 CFLAGS = -O3 -fomit-frame-pointer -march=native -fPIC -fPIE -pie
 CFLAGS += -Wall -Wextra -Wpedantic

Update

I personally don't think that it is necessary to mark section .rodata. It is certainly more secure that way - but the code should work regardless. So, if you don't like putting if platform == "darwin": in the .py files - perhaps you can just eliminate those p(".section .rodata") commands altogether...?

mouse07410 commented 4 years ago

Here's a patch to make avx2-hrss701 compile and run correctly: avx2-macos.diff.txt

This excerpt is similar to the previous patch, but simpler - as you did a big part of the work already:

diff --git a/avx2-hrss701/bitpermutations/bitpermutations/printing.py b/avx2-hrss701/bitpermutations/bitpermutations/printing.py
index 1702fd6..0c3c14c 100644
--- a/avx2-hrss701/bitpermutations/bitpermutations/printing.py
+++ b/avx2-hrss701/bitpermutations/bitpermutations/printing.py
@@ -4,6 +4,7 @@ import bitpermutations.data as data
 import bitpermutations.utils as utils
 from .utils import reg_to_memfunc

+from sys import platform

 def print_memfunc(f, in_size, out_size, per_reg=256, initialize=False):
     """Wraps a function that operates on registers in .data and .text sections,
@@ -22,7 +23,8 @@ def print_memfunc(f, in_size, out_size, per_reg=256, initialize=False):
     f(out_data, in_data)

     print(".data")
-    print(".section .rodata")
+    if platform != "darwin":
+        print(".section .rodata")
     print(".p2align 5")
     for mask in data.DATASECTION:
         print(mask.data())

jschanck commented 4 years ago

Ah, really thought I'd removed the .section .rodata's... just doing this with a chain of sed commands. Should be fixed now.
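
The sed commands themselves are not shown in this thread. As a rough Python equivalent of that kind of filtering, assuming the goal is simply to delete every line that consists of the `.section .rodata` directive, one could write something like the following (`strip_rodata` and `RODATA_LINE` are hypothetical names):

```python
import re

# Matches a whole line consisting only of the `.section .rodata` directive.
RODATA_LINE = re.compile(r"^\s*\.section\s+\.rodata\s*$")


def strip_rodata(asm_text):
    """Return `asm_text` with every `.section .rodata` line removed."""
    kept = [ln for ln in asm_text.splitlines() if not RODATA_LINE.match(ln)]
    return "\n".join(kept) + "\n"
```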

mouse07410 commented 4 years ago

Perfect! Your macOS branch builds and runs fine on MacOS 10.15.6 with the latest stable Xcode.

Thank you!

jschanck commented 4 years ago

Great! Did you check all 4 parameter sets, or just hrss701?

mouse07410 commented 4 years ago

I checked hrss701 and hps4096821. I assume you did the same for the other two hpsXXXXXXX. ;-)

jschanck commented 4 years ago

I checked them on Linux---as I said, I don't have a macOS system to test on right now.

If you check the other two parameter sets, I can close this issue and merge the branch.

mouse07410 commented 4 years ago

If you check the other two parameter sets, I can close this issue and merge the branch.

Yep, tested all the four. All are good to go!

Getting some warnings - not sure what to do about them, and how bad they are in general. My "normal" approach is to get rid of all the compiler warnings, but I acknowledge that it doesn't always work...

Building avx2-hps4096821 (and other avx2-hpsXXXXXXX):

$ CC=gcc make -C avx2-hps4096821/ clean all
. . . . .
/usr/bin/cc -O3 -fomit-frame-pointer -march=native -fPIC -fPIE -pie -Wall -Wextra -Wpedantic -o test/speed_r2_inv fips202.c crypto_sort_int32.c djbsort/sort.c kem.c owcpa.c pack3.c packq.c poly.c poly_lift.c poly_r2_inv.c poly_s3_inv.c sample.c sample_iid.c verify.c randombytes.c square_1_821_patience.s square_3_821_patience.s square_6_821_patience.s square_12_821_shufbytes.s square_24_821_shufbytes.s square_51_821_shufbytes.s square_102_821_shufbytes.s square_204_821_shufbytes.s square_408_821_shufbytes.s poly_rq_mul.s poly_r2_mul.s poly_rq_to_s3.s vec32_sample_iid.s poly_mod_3_Phi_n.s poly_mod_q_Phi_n.s cpucycles.c test/speed_r2_inv.c
clang: warning: argument unused during compilation: '-pie' [-Wunused-command-line-argument]
djbsort/sort.c:25:7: warning: extension used [-Wlanguage-extension-token]
      int32_MINMAX(*x,*y);
      ^
djbsort/int32_minmax_x86.c:4:3: note: expanded from macro 'int32_MINMAX'
  asm( \
  ^
djbsort/sort.c:183:5: warning: extension used [-Wlanguage-extension-token]
    int32_MINMAX(x1,x0);
    ^
djbsort/int32_minmax_x86.c:4:3: note: expanded from macro 'int32_MINMAX'
  asm( \
  ^
djbsort/sort.c:184:5: warning: extension used [-Wlanguage-extension-token]
    int32_MINMAX(x3,x2);
    ^
djbsort/int32_minmax_x86.c:4:3: note: expanded from macro 'int32_MINMAX'
  asm( \
  ^
djbsort/sort.c:185:5: warning: extension used [-Wlanguage-extension-token]
    int32_MINMAX(x2,x0);
    ^
. . . . .
djbsort/int32_minmax_x86.c:4:3: note: expanded from macro 'int32_MINMAX'
  asm( \
  ^
djbsort/sort.c:1176:5: warning: extension used [-Wlanguage-extension-token]
    int32_MINMAX(x[j],x[j+1]);
    ^
djbsort/int32_minmax_x86.c:4:3: note: expanded from macro 'int32_MINMAX'
  asm( \
  ^
djbsort/sort.c:1177:5: warning: extension used [-Wlanguage-extension-token]
    int32_MINMAX(x[j+2],x[j+3]);
    ^
djbsort/int32_minmax_x86.c:4:3: note: expanded from macro 'int32_MINMAX'
  asm( \
  ^
djbsort/sort.c:1181:5: warning: extension used [-Wlanguage-extension-token]
    int32_MINMAX(x[j],x[j+2]);
    ^
djbsort/int32_minmax_x86.c:4:3: note: expanded from macro 'int32_MINMAX'
  asm( \
  ^
djbsort/sort.c:1183:5: warning: extension used [-Wlanguage-extension-token]
    int32_MINMAX(x[j],x[j+1]);
    ^
djbsort/int32_minmax_x86.c:4:3: note: expanded from macro 'int32_MINMAX'
  asm( \
  ^
66 warnings generated.
cpucycles.c:6:3: warning: extension used [-Wlanguage-extension-token]
  asm volatile(".byte 15;.byte 49;shlq $32,%%rdx;orq %%rdx,%%rax"
  ^
1 warning generated.

Again, on an unrelated issue - I'd like this change (pretty much throughout this repo), if possible:

diff --git a/avx2-hrss701/Makefile b/avx2-hrss701/Makefile
index acfeb76..cde95b5 100644
--- a/avx2-hrss701/Makefile
+++ b/avx2-hrss701/Makefile
@@ -1,4 +1,4 @@
-CC = /usr/bin/cc
+CC ?= /usr/bin/cc
 CFLAGS = -O3 -fomit-frame-pointer -march=native -fPIC -fPIE -pie
 CFLAGS += -Wall -Wextra -Wpedantic

It's nice when I don't have to edit Makefile to build with one compiler, or another. Same would apply to CFLAGS, though I agree that many users may have their CFLAGS set to something that this repo might not like.

jschanck commented 4 years ago

Great! Thanks for your help. I've made the CC ?= ... change too.

mouse07410 commented 4 years ago

I've made the CC ?= ... change too.

One more thing: could I ask you to apply that change to all the Makefiles, please? Thanks!

jschanck commented 4 years ago

Can you point me to the specific file that you're having trouble with? I changed all 8 of the files named "Makefile".

mouse07410 commented 4 years ago

Darn... I did git pull several times, and it did not pick up that change. Let me re-clone and report.

mouse07410 commented 4 years ago

Funny. I removed the repo, re-cloned - and everything's good! No, I don't have any explanation why git pull did not do what it was supposed to.

Thanks again.