status-im / nim-secp256k1

A wrapper for libsecp256k1
Apache License 2.0
10 stars 5 forks source link

bug: Compilation error on new AMD platform #56

Closed NagyZoltanPeter closed 4 months ago

NagyZoltanPeter commented 7 months ago

While building nim-waku we faced an issue compiling secp256k1 on a new AMD Ryzen 7 PRO 7840U system (on UBUNTU 22.04 LTS) https://github.com/waku-org/nwaku/issues/2560

originally reported from outside

I would like to propose a quick workaround until it maybe fixed in the upstream library, otherwise please advise. patch: amd-asm-compilation-fix

mratsim commented 7 months ago

Can you check with Clang?

Also can you ensure that LTO is not used for crypto: https://github.com/status-im/nimbus-eth2/blob/dc19b08/config.nims#L220-L221, as if there is inlining, the compiler might run out of registers.

You might also want to ensure you have enough register by passing -fomit-frame-pointer to secp256k1 building. Cryptographic libraries need all the registers they can get, and not passing this will incure a large performance penalty:

https://www.phoronix.com/review/fedora-frame-pointer/2

image

NagyZoltanPeter commented 7 months ago

Can you check with Clang?

Also can you ensure that LTO is not used for crypto: https://github.com/status-im/nimbus-eth2/blob/dc19b08/config.nims#L220-L221, as if there is inlining, the compiler might run out of registers.

You might also want to ensure you have enough register by passing -fomit-frame-pointer to secp256k1 building. Cryptographic libraries need all the registers they can get, and not passing this will incure a large performance penalty:

https://www.phoronix.com/review/fedora-frame-pointer/2

image

Certainly, but it may not help find out the issue. I have no access to that type of AMD cpu powered machine that produced the error we faced. My colleague has that one. On my Intel with gcc 12 on Ubuntu I can compile even the asm inline part, tests are just seems fine. Also another issue that in nim-waku we compilee with gcc so even clang would compile it it may not help. What I can do in addition to this workaround to point our dependency to a separate branch here and thus to avoid main line would have this forced avoid of that specific asm inline code. But I'm curious if other project depending on this version of nim-secp256k1 - I suppose - would fail to compile in similar cirtumstances.

tersec commented 7 months ago

Oppose this "quick workaround" from landing in this library. It would come with visible perf regressions for nimbus-eth2.

tersec commented 7 months ago

Certainly, but it may not help find out the issue. I have no access to that type of AMD cpu powered machine that produced the error we faced. My colleague has that one.

This should be reproducible by specifying the correct -march flags to the correct gcc version (which has been specified above), even if one doesn't have said CPU, as it will "cross-compile" of sorts regardless.

NagyZoltanPeter commented 7 months ago

Thanks for the tips. I accept the perf concerns. Will try to come back with more proper analysis and/or issue a ticket of possible on the vendor repo. I'm not sure about nimbus, but waku supposed to be run on different platforms, hence the workaround till can have fix. So I can imagine a solution we can use that does not influence other users.

mratsim commented 7 months ago

Certainly, but it may not help find out the issue. I have no access to that type of AMD cpu powered machine that produced the error we faced. My colleague has that one. On my Intel with gcc 12 on Ubuntu I can compile even the asm inline part, tests are just seems fine.

The issue is about "error: ‘asm’ operand has impossible constraints" which means that GCC fails to find enough registers. If you compile with Clang and it fails you will have a better error message.

Every step, I outline will help you pinpoint the issue, this is something I dealt with on a regular basis: remove this -fomit-frame-pointer from my assembly code and the compiler will throw the "impossible constraints" error: https://github.com/mratsim/constantine/blob/976c8bb/constantine/math/arithmetic/assembly/limbs_asm_modular_x86.nim#L23-L24

Also the issue is not AMD specific, there is no AMD vs Intel, all x86-64 CPUs have the same compilation path. You need the same Ubuntu version, Ubuntu changed the defaults of GCC on 24.04. If your colleague is using a preview version, it's normal they have the issue and not you.

https://www.phoronix.com/forums/forum/software/distributions/1428123-ubuntu-24-04-lts-to-enable-frame-pointers-by-default-for-better-profiling-debugging

If you want to test on your machine, use --passC:-fno-omit-frame-pointer and confirm whether you get the same error message or not.

tersec commented 5 months ago

@mratsim nimbus-eth2 explicitly enables frame pointers:

# omitting frame pointers in nim breaks the GC
# https://github.com/nim-lang/Nim/issues/10625
switch("passC", "-fno-omit-frame-pointer")
switch("passL", "-fno-omit-frame-pointer")

to work around another Nim issue.

tersec commented 5 months ago

For anyone experiencing this from nwaku, nimbus-eth1, or nimbus-eth2: https://github.com/status-im/nimbus-eth2/issues/6324#issuecomment-2144588805 describes how to find what -march=native resolves to. I've tried repro'ing this with

# gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

by explicitly selecting either -march=znver2 or -march=znver3 by docker pull ubuntu:22.04, so there's some other necessary condition.

nwaku also uses -march=native by default as does nimbus-eth1 (which also explicitly enables frame pointers for the same reason as nimbus-eth2).

mratsim commented 4 months ago

Confirmed upstream, the library should be compiled with -fomit-frame-pointer otherwise GCC runs out of registers: https://github.com/bitcoin-core/secp256k1/pull/846#issue-739202153

tersec commented 3 weeks ago