weidai11 / cryptopp

free C++ class library of cryptographic schemes
https://cryptopp.com
Other
4.87k stars 1.51k forks source link

LTO build fails due to missint "-m" flags in linker command #865

Closed etam closed 5 years ago

etam commented 5 years ago

Full compilation log: http://susepaste.org/view//9613298

In short: -march=armv7-a -mfpu=neon flags are added to compile selected source files. Linking fails with LTO enabled, because those flags are not specified.

noloader commented 5 years ago

Thanks @etam,

I have not encountered this issue before. I'm going to setup a test machine using a BananaPi and Tumbleweed. Hang tight.

The build log shows this:

...
749.   [  274s] make: Nothing to be done for 'static'.
750.   [  303s] /usr/lib/gcc/armv7hl-suse-linux-gnueabi/9/include/arm_neon.h: In function 'LEA_Decryption'
...

LEA_Decryption is compiled during lea_simd.cpp at line 606 in the log file

606.   [  156s] g++ -DNDEBUG -O2 ... -pthread -fopenmp -march=armv7-a -mfpu=neon -c lea_simd.cpp

lea_simd.cpp was compiled with -march=armv7-a -mfpu=neon. I am suffering a disconnect.

Is it possible to determine what line 750 is referring to?

noloader commented 5 years ago

I have not encountered this issue before. I'm going to setup a test machine using a BananaPi and Tumbleweed. Hang tight.

Oh, man. Bad choice of words. I am hanging after U-boot when the kernel is loading (or has loaded). Ugh.

I'm using the JeOS image from HCL:BananaPi | Tumbleweed | JeOS image.

Is there some sort of safe mode to get this board to boot?

noloader commented 5 years ago

@etam,

I tried the HCL:BeagleBone Black image for my BBB board. That image did not appear to boot. I don't have a micro-HDMI cable, so I did not get to see display output. But none of the LEDs flashed like they do with a Debian image.

I have a spare Tinker Board. openSUSE has a page at HCL:Tinker Board but it lacks an image.

Can you kick-off an image build for me to download?

noloader commented 5 years ago

@etam,

I found one bug related to openSUSE make flags we use. It was fixed at Commit 058a59814f0b.

I'm trying to open a bug report to get the change added to the libcryptopp-7.0.0 package. The openSUSE bug reporter and the "Reset Password" looks like it is bent. I cannot scroll down to finish the password recovery. (I'm on a laptop with a regular screen; not a tablet with a touchscreen).

image

The screenshot is from Windows 8.1, Firefox 68.0.

Would you please report it to the webmaster at openSUSE.

noloader commented 5 years ago

@etam,

I don't know if this is good or bad, but I cannot duplicate the issue under Fedora 30 x86_64. I just ran through the build process using openSUSE flags, and everything compiled and linked. In fact, I wrote a wiki article covering the procedure at Link Time Optimization.

Also, according to http://susepaste.org/view//9613298, it looks like the build lacks AR=gcc-ar and RANLIB=gcc-ranlib. That is, the build should proceed as follows when using GCC:

AR=gcc-ar
RANLIB=gcc-ranlib
CXXFLAGS='-DNDEBUG -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -flto=6 -g -fpic -fPIC -pthread -fopenmp'
make -j 6

I don't know what Clang uses at the moment.

noloader commented 5 years ago

@etam,

I got to do some more testing today. I was not able to get openSUSE image to run. However, I was able to test under Armbian and Debian.

So, using your CXXFLAGS worked as expected for Fedora 30 x86_64 and Armbian 18.04 Aarch64. However, things broke on Armbian 18.04 ARMv7. Armbian 18.04 ARMv7 uses GCC 7.2, but I think it shows the problems with the compiler.

Here are the ARMv7 results:

RANLIB=gcc-ranlib AR=gcc-ar CXXFLAGS="-DNDEBUG -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -flto=6 -g -fpic -fPIC -pthread -fopenmp" make
...
g++ -o cryptest.exe -DNDEBUG -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -flto=6 -g -fpic -fPIC -pthread -fopenmp adhoc.o test.o bench1.o bench2.o bench3.o datatest.o dlltest.o fipsalgt.o validat0.o validat1.o validat2.o validat3.o validat4.o validat5.o validat6.o validat7.o validat8.o validat9.o validat10.o regtest1.o regtest2.o regtest3.o regtest4.o ./libcryptopp.a  -lgomp
pubkey.h:640:26: warning: type ‘struct TF_ObjectImpl’ violates the C++ One Definition Rule [-Wodr]
 class CRYPTOPP_NO_VTABLE TF_ObjectImpl : public TF_ObjectImplBase<BASE, SCHEME_OPTIONS, KEY_CLASS>
                          ^
pubkey.h:640:26: note: a different type is defined in another translation unit
 class CRYPTOPP_NO_VTABLE TF_ObjectImpl : public TF_ObjectImplBase<BASE, SCHEME_OPTIONS, KEY_CLASS>
                          ^
pubkey.h:651:11: note: the first difference of corresponding definitions is field ‘m_trapdoorFunction’
  KeyClass m_trapdoorFunction;
           ^
pubkey.h:651:11: note: a field of same name but different type is defined in another translation unit
  KeyClass m_trapdoorFunction;
           ^
...

Searching GCC Bugzilla reveals 0 bugs for "LTO ODR". I'll likely never be able to create a minimum reproducer for the issue to report it at the GCC Bugzilla. It will likely stay broke for years.

At this point, I think it is best to avoid LTO when using GCC for ARM platforms since it is known bad. And avoid LTO when using Clang for all platforms. It is also known bad.

plater commented 5 years ago

Am updating openSUSE to 8.2.0 and have incorporated : 0001-Fix-TCXXFLAGS-using-openSUSE-standard-flags-GH-865.patch See https://build.opensuse.org/package/show/home:plater/libcryptopp

etam commented 5 years ago

I'm also at https://build.opensuse.org/package/show/home:etamPL:branches:devel:libraries:c_c++/libcryptopp

@plater Your home repo has fewer distro/arch repositories, so it'll be more difficult to tell if patch is really working.

noloader commented 5 years ago

@plater,

You may want this one too: commit 417fbd719ab6. It fixed a copy/paste error in GNUmakefile-cross that was introduced with 0001-Fix-TCXXFLAGS-using-openSUSE-standard-flags.

plater commented 5 years ago

Added 0001-Fix-missing-if-statement.patch and pushed to devel:libraries:c_c++, I've test built with armv7l and aarch64.

noloader commented 5 years ago

@plater,

Forgive my ignorance. Are there additional patches being used by openSUSE? If so I'd like to look at them.

(If there are additional patches, then they may not be needed. We try to stay on top of distro patches, and incorporate them so you don't have to fuss with them in a subsequent release).

noloader commented 5 years ago

@plater,

I can't find the thread I'm looking for to reply to you on, but ...

Regarding -march=armv7-a -mfpu=neon for ARM, be careful of CXXFLAGS += -march=armv7-a -mfpu=neon on down-level ARM devices. GCC will select code from ARMv7-a and use it unconditionally. On a down-level device, like an old Android tablet, that will most likely cause a SIGILL.

Crypto++ compiles some source files with -march=armv7-a -mfpu=neon on ARM. There are about 22 source files, iirc. The code paths are guarded at runtime with HasARMv7() (and friends like HasNEON()) to avoid the SIGILL. It usually looks something like this in the source code (with some hand waiving):

if (HasNEON())
{
    // lea_simd.cpp, compiled with CXXFLAGS + -march=armv7-a -mfpu=neon
    // NEON implementation, multiple blocks at a time
    LEA_Enc_AdvancedProcessBlocks_NEON(const byte *inBlock, const byte *xorBlock, byte *outBlock, size_t blocks);
}
else
{
    // lea.cpp, compiled with CXXFLAGS
    // C++ implementation, one block at a time
    LEA::Enc::ProcessAndXorBlock(const byte *inBlock, const byte *xorBlock, byte *outBlock)
}

I can only say "be careful of..." at the moment because I don't know openSUSE build system well enough to say something more useful about it.

etam commented 5 years ago

I think that for this package, LTO should be disabled.

I found this in gcc manual, under -flto:

There are some code generation flags preserved by GCC when generating bytecodes, as they need to be used during the final link. Currently, the following options and their settings are taken from the first object file that explicitly specifies them: -fPIC, -fpic, -fpie, -fcommon, -fexceptions, -fnon-call-exceptions, -fgnu-tm and all the -m target flags.

noloader commented 5 years ago

@etam,

I think that for this package, LTO should be disabled.

Yeah, that sounds like a good idea.

And for Crypto++, LTO seems to slow things down a bit: Link Time Optimization | Performance. A drift of about 2 - 3 MB/s for the measurement is about normal (give or take). A loss of 32 MB/s in Geometric Throughput is abnormal. It indicate an overall slowdown.

noloader commented 5 years ago

@etam, @plater,

Would openSUSE mind picking up this patch, too: Commit ce6d3c1306cf. It clears an interop problem after we changed ECIES to align with Bouncy Castle and Botan.

This is the important part, from gfpcrypt.h. The other parts are just regression tests.

    size_t GetSymmetricKeyLength(size_t plaintextLength) const
-        {return plaintextLength + static_cast<size_t>(MAC::DIGESTSIZE);}
+        {return plaintextLength + static_cast<size_t>(MAC::DEFAULT_KEYLENGTH);}

The change will be available in Crypto++ 8.3.

plater commented 5 years ago

@plater,

Forgive my ignorance. Are there additional patches being used by openSUSE? If so I'd like to look at them.

(If there are additional patches, then they may not be needed. We try to stay on top of distro patches, and incorporate them so you don't have to fuss with them in a subsequent release).

Apart from the git patches I also use libcryptopp-shared.patch.txt to facilitate the installation of the libraries in /usr/lib64 for x86_64 and /usr/lib for 32 bit. The patch also increases the libs SONAME to major.minor.patch from .major

noloader commented 5 years ago

Thanks @plater.

I can probably make this change if you want:

-LIBDIR := $(PREFIX)/lib
+LIBDIR := $(PREFIX)/$(LIB)

When I need to install on Fedora, I usually use make ... PREFIX=/usr LIBDIR=/usr/lib64. But the change with a default that preserves behavior should be easy enough.


Is this how we should be doing things:

-SOLIB_FLAGS=-Wl,-soname,libcryptopp.so$(SOLIB_COMPAT_SUFFIX)
+SOLIB_FLAGS=-Wl,-soname,libcryptopp.so$(SOLIB_VERSION_SUFFIX)

If this follows the GNU Coding Standards, then I think we should align with the standards.


I just added this change under Issue 866:

-cryptest.exe:libcryptopp.a $(TESTOBJS)
-   $(CXX) -o $@ $(strip $(CXXFLAGS)) $(TESTOBJS) ./libcryptopp.a $(LDFLAGS) $(LDLIBS)
+cryptest.exe: libcryptopp.so$(SOLIB_VERSION_SUFFIX) $(TESTOBJS)
+   $(CXX) -o $@ $(strip $(CXXFLAGS)) $(TESTOBJS) -L. -lcryptopp $(LDFLAGS) $(LDLIBS)
plater commented 5 years ago

I originally made the soname change due to a clementine bug here's my changes file entry:

plater commented 5 years ago

@etam, @plater,

Would openSUSE mind picking up this patch, too: Commit ce6d3c1306cf. It clears an interop problem after we changed ECIES to align with Bouncy Castle and Botan.

This is the important part, from gfpcrypt.h. The other parts are just regression tests.

    size_t GetSymmetricKeyLength(size_t plaintextLength) const
-        {return plaintextLength + static_cast<size_t>(MAC::DIGESTSIZE);}
+        {return plaintextLength + static_cast<size_t>(MAC::DEFAULT_KEYLENGTH);}

The change will be available in Crypto++ 8.3.

This patch fails to apply to v8.2.0 due to added comments so far.

noloader commented 5 years ago

This patch fails to apply to v8.2.0 due to added comments so far.

Ack. How about this one. It is based on the tagged version of Crypto++ 8.2:

$ git diff
diff --git a/gfpcrypt.h b/gfpcrypt.h
index 1b26a56b..e0102df0 100644
--- a/gfpcrypt.h
+++ b/gfpcrypt.h
@@ -704,7 +704,7 @@ public:

     bool ParameterSupported(const char *name) const {return strcmp(name, Name::EncodingParameters()) == 0;}
     size_t GetSymmetricKeyLength(size_t plaintextLength) const
-        {return plaintextLength + static_cast<size_t>(MAC::DIGESTSIZE);}
+        {return plaintextLength + static_cast<size_t>(MAC::DEFAULT_KEYLENGTH);}
     size_t GetSymmetricCiphertextLength(size_t plaintextLength) const
         {return plaintextLength + static_cast<size_t>(MAC::DIGESTSIZE);}
     size_t GetMaxSymmetricPlaintextLength(size_t ciphertextLength) const
noloader commented 5 years ago

@etam,

The rules for versioning are at: https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html

I remember that page. I walked away after I read some of their comments.

You will probably need to keep maintaining that part of the patch. Sorry about that.

plater commented 5 years ago

This patch fails to apply to v8.2.0 due to added comments so far.

Ack. How about this one. It is based on the tagged version of Crypto++ 8.2:

$ git diff
diff --git a/gfpcrypt.h b/gfpcrypt.h
index 1b26a56b..e0102df0 100644
--- a/gfpcrypt.h
+++ b/gfpcrypt.h
@@ -704,7 +704,7 @@ public:

     bool ParameterSupported(const char *name) const {return strcmp(name, Name::EncodingParameters()) == 0;}
     size_t GetSymmetricKeyLength(size_t plaintextLength) const
-        {return plaintextLength + static_cast<size_t>(MAC::DIGESTSIZE);}
+        {return plaintextLength + static_cast<size_t>(MAC::DEFAULT_KEYLENGTH);}
     size_t GetSymmetricCiphertextLength(size_t plaintextLength) const
         {return plaintextLength + static_cast<size_t>(MAC::DIGESTSIZE);}
     size_t GetMaxSymmetricPlaintextLength(size_t ciphertextLength) const

The build fails with: [ 369s] /usr/lib/gcc/armv7hl-suse-linux-gnueabi/9/include/arm_neon.h: In function 'LEA_Decryption': [ 369s] /usr/lib/gcc/armv7hl-suse-linux-gnueabi/9/include/arm_neon.h:4835:48: fatal error: You must enable NEON instructions (e.g. '-mfloat-abi=softfp' '-mfpu=neon') to use these intrinsics. [ 369s] 4835 | return (uint32x4_t)builtin_neon_vshl_nv4si ((int32x4_t) a, __b); [ 369s] | ^ [ 369s] compilation terminated.

I don't want to mess with the build flags

noloader commented 5 years ago

I don't want to mess with the build flags

I know the feeling... No problems.

plater commented 5 years ago

Won't build for arm7 without the CXXFLAGS -march=armv7-a -mfpu=neon final mix is in: https://build.opensuse.org/package/show/devel:libraries:c_c++/libcryptopp I'm going to wait for a few days before submitting to the distribution, any comments appreciated.

noloader commented 5 years ago

@plater,

Won't build for arm7 without the CXXFLAGS -march=armv7-a -mfpu=neon ...

Yeah, I believe you need to disable LTO for ARMv7. Also see Link Time Optimization on the Crypto++ wiki.

openSUSE's announcement of enabling LTO and Red Hat's contemplation of enabling LTO is being discussed on the GCC-dev mailing list. I hinted that LTO on ARM is bent or broken. Also see Can LTO minor version be updated in backward compatible way ? on the GCC mailing list.

noloader commented 5 years ago

I've been trying to duplicate this with a reproducer for a GCC bug report. No joy at the moment.

I think there are two problems. First, I only have access to GCC 6.3 on ARM; not GCC 9. Second, the reproducers are too simple.

Regarding "too simple", I think what needs to happen is, x.o has a SIMD function body. GCC decides to optimize it at link time, so it recompiles x.cpp. My reproducer's "too simple" functions don't have function bodies in x.o when looking at them with objdump --disassemble.

plater commented 5 years ago

Managed to filter out the -flto= flag for the armv7l build, I no longer have to add extra build flags.

etam commented 5 years ago

@plater Here you have a recipe for easier disabling of LTO in spec file: https://en.opensuse.org/openSUSE:LTO#LTO_disablement_in_a_SPEC_file