aklomp / base64

Fast Base64 stream encoder/decoder in C99, with SIMD acceleration
BSD 2-Clause "Simplified" License

With gcc 8.2 won't build due to __x86.get_pc_thunk.bx discarded #50

Closed: htot closed this issue 2 years ago

htot commented 5 years ago

With gcc 8.2 and an i686 target, the following error is generated:

`__x86.get_pc_thunk.bx' referenced in section `.text' of lib/libbase64.o: defined in discarded section `.text.__x86.get_pc_thunk.bx[__x86.get_pc_thunk.bx]' of lib/libbase64.o

I found 2 ways to fix this:

  1. add -fno-pie to CFLAGS in Makefile
  2. add __x86.get_pc_thunk.bx to exports.txt

With 1 we get:

root@edison:~/base64# OMP_THREAD_LIMIT=1 ./benchmark 
Filling buffer with 10.0 MB of random data...
Testing with buffer size 10 MB, fastest of 10 * 1
plain   encode  82.34 MB/sec
plain   decode  134.75 MB/sec

but no address space layout randomization (ASLR). With 2 we get:

plain   encode  77.88 MB/sec
plain   decode  81.43 MB/sec

My vote goes to 1.

aklomp commented 5 years ago

Thanks for the report, and for taking a stab at fixing it. I'm unsure of how to proceed though. The issues I have with the proposed solutions:

  1. Fixes it, but at a high cost. I don't believe that this library should dictate high impact policies like disallowing position-independent code for the whole project it's linked against. Keep in mind that this library is intended to be linked with a larger project, so it should be very humble and unassuming.

  2. A GCC-specific hack that could have unintended consequences. The whole idea of the exports list is to export the absolute minimum, and in this solution it would be expanded to include some compiler-specific function that the library has no control over, and which could be arbitrarily renamed in the future. The exported function could clash with another definition from elsewhere. That's not great. Also, the performance goes down because, I assume, there's an extra indirection before calling every function.

What I think is that this issue should be solved in the outer project that includes this library. That project should either export CFLAGS=-fno-pie when building the library, or it should provide its own definition of __x86.get_pc_thunk.bx, which it should if it's compiled position-independently.

htot commented 5 years ago

As it is, you can build neither benchmark nor test_base64. That doesn't sound right.

As I understand it, __x86.get_pc_thunk.bx is needed and provided by the compiler in certain (our) cases and is not needed anywhere other than in libbase64.o. Therefore it is not provided anywhere else.

I believe __x86.get_pc_thunk.bx is called on entry to certain functions, which would explain the performance cost.
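
To illustrate the point about the thunk, here is a minimal sketch of the kind of function for which GCC may emit such a call. The names `demo_table` and `demo_lookup` are hypothetical and are not code from this library. The assumption is that 32-bit x86 has no instruction-pointer-relative data addressing, so position-independent code first has to fetch its own address in order to locate global data. Whether a thunk appears at all, and under which name, depends on the compiler version and flags.

```c
#include <stddef.h>

/* Hypothetical example data: a global table that the function below
 * must locate at run time when built as position-independent code. */
const char demo_table[4] = { 'A', 'B', 'C', 'D' };

/* Hypothetical helper: indexes the global table.
 * When compiled position-independently for i686, GCC typically loads
 * the current instruction address into a register (via a helper such
 * as __x86.get_pc_thunk.bx) before it can address demo_table. */
char demo_lookup(size_t i)
{
    return demo_table[i & 3]; /* mask keeps the index inside the table */
}
```

Compiling this for a 32-bit target with something like `gcc -m32 -fPIE -O2 -S` and comparing the assembly against a `-fno-pie` build should show the difference, though the exact output will vary between toolchains.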

htot commented 5 years ago

On x86_64 even -fno-pie does not suffice. Building with -fPIC does.

aklomp commented 5 years ago

Let's take this one step at a time. So this library is intended to be included and built by some outer project. As I see it, the outer project is responsible for supplying any environment CFLAGS that are required to make the software for that specific project on its specific platform, and that includes any nonstandard flags for the platform compiler. I would consider it misplaced and user-hostile for a small library like this to mandate that the outer software must use -fPIC. Generating position-independent code is not a small favor to ask; it can have many repercussions.

That the benchmarks and the tests don't build is a problem, but as I see it, the way to fix that is by making these benchmarks and tests proper users of the library. The tests and benchmarks should be set up as users of the library like any other. They should set their own CFLAGS and they can build with -fPIC if they want. Currently this is not how things are set up, but I think that that would solve this strange situation where those programs are children of the library instead of vice versa.

Moving on, I tried to reproduce this bug with GCC 9.1.0 on my machine and couldn't; everything works as normal. It looks like GCC might have reverted the earlier puzzling and very un-C-like behavior where functions depend on "hidden" symbols that need to be in the global namespace.

All in all I'm still not sure whether this should be fixed, and if so, what way would be best. I'm inclined to write it off as GCC weirdness, and tell people to avoid building with GCC 8.2.

htot commented 5 years ago

GCC 8.2 is the compiler on Yocto Thud.

aklomp commented 4 years ago

I expect that this will be fixed in #54, by changing the way that symbol visibility is handled.
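
For readers unfamiliar with the term, the following is a rough sketch of one common way to control symbol visibility from within C source instead of through a linker-level export list. It is an assumption about the general technique only and does not describe what #54 actually changes. The names `DEMO_EXPORT`, `demo_internal` and `demo_public` are hypothetical.

```c
/* Hypothetical export macro: marks a symbol as visible outside the
 * library when building with GCC or a compatible compiler. */
#if defined(__GNUC__)
#define DEMO_EXPORT __attribute__((visibility("default")))
#else
#define DEMO_EXPORT
#endif

/* Internal helper: 'static' keeps it out of the symbol table's
 * exported set regardless of visibility settings. */
static int demo_internal(int x)
{
    return x * 2;
}

/* Public entry point: explicitly marked as exported. */
DEMO_EXPORT int demo_public(int x)
{
    return demo_internal(x) + 1;
}
```

This kind of annotation is usually combined with GCC's `-fvisibility=hidden` option, so that only the explicitly marked functions are exported and compiler-generated helpers are left to the toolchain.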

aklomp commented 2 years ago

@htot Is this issue still relevant?

htot commented 2 years ago

Ah, very good question. I currently don't see any issues building on / for x86_64, either with Ubuntu 22.04 or for Yocto Honister. I will test this weekend for Yocto Honister with i686 and report back.

htot commented 2 years ago

After struggling a bit I built with Yocto Honister for i686 and found no issues.