mirleft / ocaml-nocrypto

OCaml cryptographic library
ISC License
111 stars 53 forks source link

nocrypto.0.5.0 illegal instruction #73

Open toolslive opened 9 years ago

toolslive commented 9 years ago

on a Quad-Core AMD Opteron 2350 we got:

#0  _mm_aeskeygenassist_si128 (__C=<optimized out>, __X=...) at /usr/lib/gcc/x86_64-linux-gnu/4.8/include/wmmintrin.h:88
#1  _nc_aesni_derive_e_key (rounds=<optimized out>, rk0=<optimized out>, key=0x1fe0b30 "") at src/native/aes/aesni.c:106
#2  caml_nc_aes_derive_e_key (key=140673840720632, off1=<optimized out>, rk=140673840720576, rounds=<optimized out>) at src/native/aes/aesni.c:331
#3  0x00000000006c5005 in camlNocrypto__Cipher_block__fun_2213 () at src/cipher_block.ml:371
#4  0x00000000006c4f81 in camlNocrypto__Cipher_block__of_secret_with_1584 () at src/cipher_block.ml:368
#5  0x00000000006c5bf2 in camlNocrypto__Fortuna__create_1298 () at src/fortuna.ml:22
#6  0x00000000006c700c in camlNocrypto__Rng__create_1190 () at src/uncommon.ml:38
#7  0x00000000006c76ba in camlNocrypto__Rng__entry () at src/rng.ml:49
#8  0x0000000000504d19 in caml_program ()

so it's using instructions it shouldn't. Also, it would be nice to have the detection at runtime, instead of at compile-time, as we often don't know where it will be deployed.

hannesm commented 9 years ago

this seems to be a duplicate of #72, fixed in master (which now has configuration-time detection whether AES-NI is available or not). Could you please pin to master and report whether this works for you?

toolslive commented 9 years ago

compile time detection does not solve the issue for us, as we will always be compiling on a different system than where we deploy...

pqwy commented 9 years ago

@toolslive What is your scenario here? Where do you deploy to, and what capabilities do CPUs there have? How unacceptable is it for you to opam pin nocrypto to the release, then opam edit nocrypto to add --disable-modernity to the configure step? Is this a production scenario?

Asking, because this is currently a balancing act: while at some point dynamic detection will land, I want more exposure for this code before complicating it further. So we're stuck with a static decision. The current master unconditionally disables acceleration if you pass --disable-modernity, and if you --enable it, it checks whether the build machine supports the instructions and still disables them if the current host can't run them. configure defaults to --enable.

Ideally, we could export package-level configuration options to opam, and you could tweak your list of configuration variables and have the packages respect them, a scheme Gentoo linux has been using for years. But as it stands, there is no way to let a user directly influence compile-time options of packages through opam (other than pin-and-edit). So unless we start hacking around with virtual packages that denote your preference for CPU acceleration, nocrypto has to stick with a single choice.

This choice can then be either requiring people to manually opt in, or opt out of acceleration. I am trying to feel out the user base here and decide what makes more sense, and I guess it all boils down to how prevalent this scenario is, where the deployment machine lacks features the build machine has, and how inconvenient the current mechanisms of control we have are for the users.

Feedback more than appreciated. I didn't release an update precisely to get a little more info about where we run and how often we break with the current scheme.

toolslive commented 9 years ago

typically what happens is that we build an executable that ends up in a debian or redhat package. that package gets installed on core-i5, core-i7, amd opteron and other intelloids.

so we would prefer something like this:

// gcc extension, run when lib is loaded...
int have_sse_4_2 = 0;

void __cpu_detect(void) __attribute__((constructor));
void __cpu_detect(void) {
  unsigned int a = 0, b = 0, c = 0, d = 0, func = 1;
  int r = 0;
  r = __get_cpuid(func, &a, &b, &c, &d);
  if (r == 0) {
    /* Lookup failed */
    c = 0;
  }
  have_sse_4_2 = (c & bit_SSE4_2) != 0;
}

and then dispatch based on the detected capabilities.

pqwy commented 9 years ago

You mean, a little like this? It's quite clear how to go about detecting features, the problem is that the dispatch phase complicates code paths and, as I said, sufficiently so that it will have to wait a little.

The question still stands: it sounds like whatever you are doing is specific enough -- and certainly non-casual enough -- that the solution of pinning and editing configure line for your particular scenario is workable. Am I mistaken here? Have you tried that?

toolslive commented 9 years ago

--disable-modernity would work.

pqwy commented 9 years ago

I'm still trying to gauge how to proceed here. While it is possible to influence the build with configure flags, I'm not sure that the current state is good. I'm thinking of maybe disabling modernity by default if this causes problems for too many users.

Leaving this open until hopefully some opinions pile up. :smile:

domsj commented 9 years ago

What exactly do you mean with "the dispatch phase complicates code paths" ? As far as I can tell, once the feature detection is done all that is still needed is one if/else. You could even do the dispatch in ocaml if you would prefer that. You could then have tests which encode with one variant en decode with the other (if both are available according to the feature detection).

pqwy commented 9 years ago

Around five just for AES. I am working on an accelerated GCM which adds several more varying entry-points, and actually the OCaml code has to change depending on presence of PCLMUL. Then, there is using SSE to accelerate hashing (which is the current choke point if you are doing authenticated encryption), and this will add several more.

From a different POV, just the argument order of C primitives has a significant impact (~20% as of several months ago) on the overall performance. I need to benchmark how much is lost if there is a dispatch on entry; probably not too much but I expect it to factor.

Then obviously there is the fact that this is portable to ARM, which means that parts of dispatched-to code can be compiled in only conditionally.

Obviously AES itself can indeed be solved by a few switch statements, but the C code was written with being brain-dead-simple as first, second and third priority. So I want to get this out, have a bit of testing and feedback, settle the upcoming wave of changes, and then sit down and figure out how to do this without littering the code with both static and dynamic checks, while having feature-based decisions in one place, and working uniformly for other feature-dependant code. One thing I would currently like to avoid is having feature tests all over the C stubs.

So right now, I would rather disable acceleration in default builds than hack dynamic detection in just so it kinda works.

toolslive commented 9 years ago

maybe the dispatch should not be on function level, but on module level.

pqwy commented 9 years ago

That was my idea, yeah. I wanted to have a few distinct bits of code that have different and feature-dependent code paths, then figure out how to neatly swap out modules for various alternatives.

Just had a chat with @hannesm; he thinks we should optionally take ./configure flags from an env var to make it easier to automate builds with non-default configurations.

dbuenzli commented 9 years ago

Just had a chat with @hannesm; he thinks we should optionally take ./configure flags from an env var to make it easier to automate builds with non-default configurations.

I kind of dislike the implicitness of env vars. There may be a way to subvert opam's config files here. I have not tried this but if you have a look at the .opam/$SWITCH/config/global-config.config I think you can define variables in there and those will become available as opam variables. Maybe we could have some sort of opam config set $VAR $VALUE. I don't know what @AltGr thinks of that. This may not be the best solution in the long term and being able to specify variables at install time may may be a better option.

samoht commented 9 years ago

if you really want to pass a flag to opam via the env variabale, you can already do it. warning: it's ugly.

$ cat opam
# an ugly opam file
opam-version: "1.2"
name: "toto"
version: "0.1"
build: ["echo" FOO]
$ opam pin add toto . -n
$ OPAMVAR_FOO=foo opam install toto -v
[...]
+ echo "foo" (CWD=/Users/thomas/.opam/profuse/build/toto.0.1)
- foo

The opam config set $VAR $VALUE is a much better idea but unfortunately not implemented yet.

pqwy commented 9 years ago

If we could use .opam/$SWITCH/config/global-config.config to set variables, and have packages react to that, that would be absolutely smashing! Especially if we had a user interface for opam variables.

(It's also what Gentoo does, as mentioned, but at some point they went overboard.)

@samoht @AltGr -- thoughts?

samoht commented 9 years ago

You can put things in that file and that will work. You can also put things in <pkg>.config and call pkg:<var> in the opam file directly (doesn't work yet for self references, but will do in the next version).

samoht commented 9 years ago

See http://opam.ocaml.org/doc/manual/dev-manual.html#sec40 and opam config list to see how to use these .config files.

dbuenzli commented 9 years ago

@samoht is this supposed to be generated by the build system so that packages can provide information about their configuration in the opam system for other packages to consult ?

samoht commented 9 years ago

yup. but I guess you can use that to register some user preferences as well (eg. you can create a package crypto-conf with the configuration values and a hack to let the user change these values). But I agree that having opam config set is better than a hack.

pqwy commented 9 years ago

Hm.

If I add a line foo: "bar" to ~/.opam/system/config/global-config.config, I can see it in opam config list via opam config var foo, and I can refer to it from a toto-like opam snippet.

If I create a file ~/.opam/system/config/nocrypto.config and add the same line, It doesn't register as nocrypto:foo. Same for any other package, installed or otherwise.

I've been re-reading the relevant portion of the dev manual, but I can't seem to figure out the central question: are package variables a closed set?

(opam 1.2.2)

samoht commented 9 years ago

Humph, the manual is wrong. The correct path is $prefix/lib/$pkg/opam.config which might not be the best name ever. /cc @AltGr

AltGr commented 9 years ago

ah, the location for config files was changed some time ago from config/pkg.config to lib/pkg/opam.config, and we missed the update in the dev-manual. There was a good reason for the change, but I can't remember what it was -- I think that happened around the time we refactored pinned packages (i.e. in 1.2.0).

Implementing opam config set should be easy enough: the hard part is defining the interface and specifying what it does :) I've already had requests for a way of specifying package configuration variables in advance.

AltGr commented 9 years ago

ah, found it: https://github.com/ocaml/opam/issues/1385

pqwy commented 9 years ago

Thanks, makes much more sense now. :smile: This almost works.

Problem 1:

opam seems to own $prefix/lib/$pkg. The good part is that I can create the directory and opam.config with the variables prior to installing a package, and installation picks up on that. The bad part is that the entire directory is removed on uninstall, which means that variables are gone on reinstall.

Wish: opam config set or such would set variables that would stick without being lost on reinstall.

Problem 2:

I can use the new var to optionally filter out an an argument, with ["--small-animals" {animals}]. But I would like to use a var to give value to an argument, like ["--%{animals}%-animals"] to either mean --small-animals or --medium-sized-animals. I can use the experimental boolean-to-string converter, and say ["--%{animals?medium-sized:small}%-animals"], but then I have to add a boolean to opam.config, like animals: true (-> --medium-sized-animals) and animals: false (-> --small-animals). If I want the user-facing variable to be a string, and get animals: "small" -> --small-animals and animals: medium-sized -> --medium-sized-animals, I need to be able to default it to something so that I don't get ---animals if it's not explicitly set (I think I need to default to --medium-sized-animals after all).

Wish: animals: "medium-sized" would parse in an opam file and the package would see the variable, but the variable would still be user-overridable. (Am I missing a way to default an arbitrary variable?)

AltGr commented 9 years ago

There is the features field, but it is not supposed to be user-overridable (although it can refer to another variable). It was first intended for boolean values but the implementation handles strings too.

pqwy commented 9 years ago

That looks about right, thanks!

But the fact that $prefix/lib/$pkg is deleted on reinstall, together with the custom package variables, means that this form of vars is not yet suited for user-facing configuration tweaks.

I added 444e627a29c0a70b3c6975e00185b9c1765e8027 and plan to release it tomorrow. With that, users can echo "nocrypto-inhibit-modernity: true" >> ~/.opam/$(opam config var switch)/config/global-config.config to get rid of the machine-dependent code.

Going forward, I would like to avoid polluting global-config.config, though.

pqwy commented 9 years ago

0.5.1 is tagged and out, pending merge to the opam repo.

Global opam variable nocrypto-inhibit-modernity can be used to force-disable acceleration without pinning or editing the package.

@toolslive @domsj Clearly not the best solution, but will it do as an interim step?

dbuenzli commented 9 years ago

But the fact that $prefix/lib/$pkg is deleted on reinstall, together with the custom package variables, means that this form of vars is not yet suited for user-facing configuration tweaks.

I don't think that .config file as it stands is the right mechanism. We want to distinguish a) .config file as they exist, that is a description of what was configured for other packages to use and b) being able to specify configuration parameters for a package.

dbuenzli commented 9 years ago

Wish: animals: "medium-sized" would parse in an opam file and the package would see the variable, but the variable would still be user-overridable. (Am I missing a way to default an arbitrary variable?)

As I commented on the opam issue there should be a way to declare and document the variables, having a default value should be added to the list. But this discussion should be continued there.

toolslive commented 9 years ago

for us, that interim option is ok.

dbuenzli commented 9 years ago

@pqwy I have written a proposal there https://github.com/ocaml/opam/issues/2247 your feedback is welcome.

edwintorok commented 9 years ago

You can use -march=native (works with GCC and Clang) to get all the features of the build machine instead of specifying -m<feature> flags individually, however I don't recommend this: it may work fine for systems like gentoo, but can be quite a problem for any OS that distributes binary packages, and virtual machines are another problem. There is also a GCC extension, GNU indirect functions (ifunc attribute) that can perform function dispatch at load time, but unfortunately only works on Linux (and perhaps FreeBSD) and not supported by Clang yet. Its probably possible to achieve something similar using function pointers and perform the dispatch once on startup (instead of load time like with ifunc) in a portable way.