libtom / libtomcrypt

LibTomCrypt is a fairly comprehensive, modular and portable cryptographic toolkit that provides developers with a vast array of well known published block ciphers, one-way hash functions, chaining modes, pseudo-random number generators, public key cryptography and a plethora of other routines.
https://www.libtom.net
Other
1.57k stars 461 forks source link

Improve SSE4.1/AES-NI support #644

Closed tbvdm closed 2 months ago

tbvdm commented 8 months ago

The CryptX Perl module contains a vendored copy of libtomcrypt. It uses -msse4.1 -maes to enable AES-NI support. The problem is that these flags are used for all source files. This results in SIGILL crashes on CPUs without SSE4.1.

I think the solution is for CryptX to use -msse4.1 -maes only for aesni.c. But this is not possible without several changes to libtomcrypt. This pull request is a proposal for those changes.

The primary change is in how SSE4.1 is detected. Currently libtomcrypt checks __SSE4_1__. But this requires that -msse4.1 is used for multiple source files. Moreover, it's a compile-time check, not a run-time one. I think it would be better to detect SSE4.1 with cpuid.

Then there is the build procedure. I came up with the following approach. By default, AES-NI support is disabled. To enable it, add -DLTC_AES_NI to CFLAGS and set the required compiler flags in CFLAGS_AES_NI (typically -maes -msse4.1). For example:

make -f makefile.unix CFLAGS=-DLTC_AES_NI CFLAGS_AES_NI="-maes -msse4.1"

Checklist

Closes #641

tbvdm commented 8 months ago

Maybe I should cc @karel-m

sjaeckel commented 8 months ago

Did you try the proposed change of #641 and whether this maybe fixes it already?

tbvdm commented 8 months ago

641 doesn't help, unfortunately. Here is a crash from the fix-perl-cryptx-99 branch on a machine without SSE4.1 (running OpenBSD):

$ git clone -b fix-perl-cryptx-99 https://github.com/libtom/libtomcrypt.git
$ cd libtomcrypt
$ make -f makefile.unix CFLAGS="-maes -msse4.1"
$ cd ..
$ cat test.c
#include <tomcrypt.h>

int main(void)
{
    register_cipher(&aes_desc);
    gcm_test();
}
$ cc -I libtomcrypt/src/headers libtomcrypt/libtomcrypt.a test.c
$ ./a.out
Illegal instruction (core dumped)
$ egdb -q a.out a.out.core
Reading symbols from a.out...
(No debugging symbols found in a.out)
[New process 186382]
Core was generated by `a.out'.
Program terminated with signal SIGILL, Illegal instruction.
#0  0x00000a534708a0f0 in gcm_test ()
(gdb) x/i $pc
=> 0xa534708a0f0 <gcm_test+1696>:   ptest  %xmm0,%xmm0
HaikalAzaim commented 5 months ago

Did you try the proposed change of https://github.com/libtom/libtomcrypt/pull/641 and whether this maybe fixes it already?

sjaeckel commented 5 months ago

Sorry for the long delay.

641 doesn't help, unfortunately. Here is a crash from the fix-perl-cryptx-99 branch on a machine without SSE4.1 (running OpenBSD):

$ git clone -b fix-perl-cryptx-99 https://github.com/libtom/libtomcrypt.git
$ cd libtomcrypt
$ make -f makefile.unix CFLAGS="-maes -msse4.1"
$ cd ..
$ cat test.c
#include <tomcrypt.h>

int main(void)
{
  register_cipher(&aes_desc);
  gcm_test();
}
$ cc -I libtomcrypt/src/headers libtomcrypt/libtomcrypt.a test.c
$ ./a.out
Illegal instruction (core dumped)
$ egdb -q a.out a.out.core
Reading symbols from a.out...
(No debugging symbols found in a.out)
[New process 186382]
Core was generated by `a.out'.
Program terminated with signal SIGILL, Illegal instruction.
#0  0x00000a534708a0f0 in gcm_test ()
(gdb) x/i $pc
=> 0xa534708a0f0 <gcm_test+1696>: ptest  %xmm0,%xmm0

FMU that's a different problem. You can also see that it fails in gcm_test() and not in AES related code.

A valid test-case for AES-NI would be to first configure&build the library as you did and then execute this:

#include <tomcrypt.h>

int main(void)
{
    aes_test();
}

Regarding what you're seeing: You're explicitly enabling SSE4.1, which allows the compiler to generate SSE-specific instructions.

WARNING: I checked the disassembly in the past and was too lazy to re-check now, so I hope the following statement is still correct :) Please correct me if I'm wrong!

The following code comes from an era when compilers were not yet that smart:

https://github.com/libtom/libtomcrypt/blob/f7e6519fae1e11ff5ff9d36c84101a673002133b/src/encauth/gcm/gcm_mult_h.c#L19-L41

By enabling -msse4.1 at compile time you did not enable that "assembly optimized path" in the source code (via the define etc.), but the compilers got so smart that they generate exactly those assembly instructions from the "default code path" (even when LTC_GCM_TABLES_SSE2 is not defined).

You should also be able to reproduce this failure with the latest release (master branch) that has no AES-NI support yet.

Maybe also try to add debug information when building the library to see where exactly it breaks? make -f makefile.unix CFLAGS="-maes -msse4.1 -g3"

Yes, these details are only slightly related to the error you see, but should show you the problem.

sjaeckel commented 5 months ago

And also you're right with this PR... now I understand the purpose :)

With this we have now a "proper" runtime check for AES-NI. It allows to build the rest of the library for all amd64 (also non SSE) and only requires the AES-NI related parts to be built with SSE4.1 support.

tbvdm commented 5 months ago

Sorry for the long delay.

No worries. Thanks for looking at this.

FMU that's a different problem. You can also see that it fails in gcm_test() and not in AES related code.

[...]

Regarding what you're seeing: You're explicitly enabling SSE4.1, which allows the compiler to generate SSE-specific instructions.

That's precisely the point: currently, if you want to enable AES-NI support, you need to enable SSE4.1 unconditionally for the whole source tree.

This is especially relevant when building binary packages. The same binary package should (ideally) be able to run different CPUs with different capabilities.

And also you're right with this PR... now I understand the purpose :)

With this we have now a "proper" runtime check for AES-NI. It allows to build the rest of the library for all amd64 (also non SSE) and only requires the AES-NI related parts to be built with SSE4.1 support.

Yes, exactly. :)

tbvdm commented 5 months ago

I've force-pushed a new commit that includes the change you requested.

tbvdm commented 5 months ago

Why don't we use target attributes? See the new commit. This avoids the need for flags altogether. You only have to make sure that LTC_AES_NI is defined. For example:

make -f makefile.unix CFLAGS=-DLTC_AES_NI

I defined __attribute__ away for MSVC. Also, MSVC does not need flags or attributes to enable SSE4.1 and AES intrinsics.

(However, I don't think MSVC will work anyway, because apparently it does not support inline assembly on x64. So the cpuid call will fail. I think the solution is to call cpuid through an intrinsic.)

tbvdm commented 3 months ago

I've pushed a new commit that is rebased on current develop, squashes the two commits in this PR, and includes @levitte's suggestions.

tbvdm commented 3 months ago

Done!

sjaeckel commented 3 months ago

This also supersedes #641 right?

tbvdm commented 3 months ago

Yes, I believe it does.

sjaeckel commented 2 months ago

Before merging this I have one more question:

Wouldn't it now make sense to enable this by default on x86/amd64? If yes, we should change the compile-time-switch to disable if not desired.

Btw. thanks (and sorry) @tbvdm for doing this lengthy exercise with us ;)

karel-m commented 2 months ago

@tbvdm could you please rebase this PR's branch on top of current develop? I will try to make a dev release of CryptX perl bindings with this patch to check what will happen on cpantesters & co.

sjaeckel commented 2 months ago

Wouldn't it now make sense to enable this by default on x86/amd64?

@karel-m @levitte I'd also like your opinion on this

karel-m commented 2 months ago

Wouldn't it now make sense to enable this by default on x86/amd64?

I do not have a strong opinion on this but I would rather keep it off by default,

According to my experiance you basically need:

levitte commented 2 months ago

I think I'm leaning on the cautious side re the default. @karel-m said it a lot better

sjaeckel commented 2 months ago

Alright, then we'll stay away from the danger zone and leave it off by default ;)

@tbvdm I took the liberty to rebase and force-push your branch, I hope that's OK

sjaeckel commented 2 months ago

I guess we forgot to enable this in the CI build ... ;)

https://github.com/libtom/libtomcrypt/blob/ce904c86efae8ecbf167e3e20babdd4cfd5511d7/.github/workflows/main.yml#L41

tbvdm commented 2 months ago

Thanks for merging this. :)

Yes, it seems I missed main.yml when I grepped the tree for relevant files.