cloudflare / sslconfig

Cloudflare's Internet facing SSL configuration
BSD 3-Clause "New" or "Revised" License

Shared library linking - reloc errors due to non-local jmp in poly1305_avx2.pl #34

Closed by anirudhvr 7 years ago

anirudhvr commented 8 years ago

Hi @vkrasnov - I'm getting errors of this form when linking with gcc 4.9. The library builds fine, but when I compile another program against libcrypto, errors like the following occur (everything is compiled with -fPIC, so it's not quite as simple as the diagnostic makes it seem):

binutils/bin/gold/ld: error: build/openssl/lib/libcrypto_pic.a(poly1305_avx2.o): requires dynamic R_X86_64_PC32 reloc against 'poly1305_update_x64' which may overflow at runtime; recompile with -fPIC
collect2: error: ld returned 1 exit status

It appears that the two jmps to poly1305_update_x64 and poly1305_finish_x64 in poly1305_avx2.pl cannot be relocated, and -fPIC doesn't have an effect on assembly code.

I'm no expert, but I looked around a bit and it seems there are ways to calculate the offset relative to the GOT/PLT when doing the jump.

Alternatively, why not just do this in code, i.e., something like this (in e_chacha20poly1305.c) and remove the jmps from the asm? This seems to work but I'd like your thoughts before I send a PR.

`static void poly1305_update_avx2_base(poly1305_state *state, const uint8_t *in, size_t in_len) {
    if (in_len >= 512) {
        poly1305_update_avx2(state, in, in_len);
    } else {
        poly1305_update_x64(state, in, in_len);
    }
}

static void poly1305_finish_avx2_base(poly1305_state* state, uint8_t mac[16]) {
    // In assembly, you check whether 8*7($state) == 0
    if ((uint64_t)state[56] == 0) {
        poly1305_finish_x64(state, mac);
    } else {
        poly1305_finish_avx2(state, mac);
    }
}

static void EVP_chacha20_poly1305_cpuid(EVP_CHACHA20_POLY1305_CTX *ctx) {
    if ((OPENSSL_ia32cap_loc()[1] >> 5) & 1) { /* AVX2 */
        ctx->poly1305_init_ptr = poly1305_init_x64; /* Lazy init */
        ctx->poly1305_update_ptr = poly1305_update_avx2_base;
        ctx->poly1305_finish_ptr = poly1305_finish_avx2_base; `
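
For readers who want to try the dispatch idea in isolation, here is a minimal, self-contained sketch of it. Everything prefixed `demo_` is hypothetical: the 64-byte `demo_state`, the stub routines, and the path counter stand in for the real Poly1305 state and assembly routines, which are not reproduced here. The only details taken from the proposal above are the 512-byte threshold and the flag word at byte offset 8*7. The flag is read with `memcpy` to avoid alignment and aliasing problems; whether the real state layout permits a direct load is an assumption left open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the Poly1305 state: the 64-bit word at byte
 * offset 8*7 is assumed to be the "AVX2 path was used" flag. */
typedef struct { uint8_t bytes[64]; } demo_state;

enum { DEMO_FLAG_OFFSET = 8 * 7, DEMO_AVX2_MIN_LEN = 512 };

/* Records which stub ran last: 0 = none, 1 = x64 stub, 2 = AVX2 stub. */
static int demo_last_path;

/* Read the flag word without assuming the state is 8-byte aligned. */
static uint64_t demo_avx2_flag(const demo_state *state) {
    uint64_t flag;
    memcpy(&flag, state->bytes + DEMO_FLAG_OFFSET, sizeof flag);
    return flag;
}

/* Stubs standing in for the assembly routines. */
static void demo_update_x64(demo_state *s, const uint8_t *in, size_t len) {
    (void)s; (void)in; (void)len;
    demo_last_path = 1;
}

static void demo_update_avx2(demo_state *s, const uint8_t *in, size_t len) {
    const uint64_t one = 1;
    (void)in; (void)len;
    /* Mimic the described behaviour: mark the state as AVX2-initialized. */
    memcpy(s->bytes + DEMO_FLAG_OFFSET, &one, sizeof one);
    demo_last_path = 2;
}

static void demo_finish_x64(demo_state *s, uint8_t mac[16]) {
    (void)s; (void)mac;
    demo_last_path = 1;
}

static void demo_finish_avx2(demo_state *s, uint8_t mac[16]) {
    (void)s; (void)mac;
    demo_last_path = 2;
}

/* Choose the update routine by input length, as in the proposal. */
void demo_update_dispatch(demo_state *state, const uint8_t *in, size_t in_len) {
    assert(state != NULL);
    if (in_len >= (size_t)DEMO_AVX2_MIN_LEN) {
        demo_update_avx2(state, in, in_len);
    } else {
        demo_update_x64(state, in, in_len);
    }
}

/* Choose the finish routine by inspecting the flag word. */
void demo_finish_dispatch(demo_state *state, uint8_t mac[16]) {
    assert(state != NULL);
    if (demo_avx2_flag(state) == 0) {
        demo_finish_x64(state, mac);
    } else {
        demo_finish_avx2(state, mac);
    }
}
```

With a zeroed `demo_state`, a 100-byte update followed by a finish should stay on the x64 stubs, while a 600-byte update sets the flag and sends the subsequent finish to the AVX2 stub. Note that this sketch, like the proposal, routes short updates to the x64 stub even after the flag has been set; whether that is safe for the real routines is exactly the question raised in this thread.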

vkrasnov commented 8 years ago

I think doing a PLT relative jump is the better solution
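
As a rough illustration of what a PLT-relative jump looks like, the sketch below uses file-scope assembly in C, assuming GNU assembler syntax on an x86-64 ELF target. `demo_helper` and `demo_tail_jump` are hypothetical names; this is not the change that was made to poly1305_avx2.pl. The `@PLT` suffix asks the assembler for a PLT-style relocation instead of a plain PC-relative one, which is the kind a linker can resolve when the object ends up in a shared library. On other targets the sketch falls back to an ordinary C call so it still builds.

```c
#include <assert.h>

/* Hypothetical helper standing in for a routine such as poly1305_update_x64.
 * It needs external linkage so the assembly below can reference it. */
int demo_helper(int x) { return x + 1; }

/* Implemented in assembly on x86-64 ELF, in C elsewhere. */
int demo_tail_jump(int x);

#if defined(__x86_64__) && defined(__ELF__)
__asm__(
    ".pushsection .text\n"
    ".globl demo_tail_jump\n"
    ".type demo_tail_jump,@function\n"
    "demo_tail_jump:\n"
    /* Tail-jump through the PLT; arguments stay in their registers. */
    "    jmp demo_helper@PLT\n"
    ".size demo_tail_jump,.-demo_tail_jump\n"
    ".popsection\n"
);
#else
/* Fallback for other targets: a plain call, so the example still compiles. */
int demo_tail_jump(int x) { return demo_helper(x); }
#endif
```

Calling `demo_tail_jump(41)` should return the same value as `demo_helper(41)`, since the jump simply forwards the call.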

anirudhvr commented 8 years ago

I see; thanks for the reply. Unfortunately, I know little about calculating PLT offsets so I'd have to wait for a patch from you.

Can you confirm if the above patch would/wouldn't work? It does appear that your poly1305_avx2 function sets 8*7($state) to 1 if poly1305_init_avx2 was run and 0 if not, so wouldn't the check for state[8*7] == 0 in C be sufficient?

vkrasnov commented 7 years ago

It should work now.