LeventErkok / sbv

SMT Based Verification in Haskell. Express properties about Haskell programs and automatically prove them using SMT solvers.
https://github.com/LeventErkok/sbv
Other
244 stars 34 forks source link

SBV->C: Support arbitrary size bit-vectors, and in general all(?) SBV types #687

Open yav opened 7 months ago

yav commented 7 months ago

I was trying to generate C code for AES from Cryptol, using symbolic simulatino to SBV, but I keep running into variations of this:

SBV->C: Not yet supported: join with ((SWord16,s2127),(SWord8,s2128))

Could we add support for cases like this? I'd imagine they are very common. Or perhaps I did something wrong? I am pretty sure this sort of thing used to work once upon a time.

LeventErkok commented 7 months ago

Looking at the code, it looks like we only support joining (8, 8), (16, 16), and (32, 32); and nothing else. And the reason for that is the results map to "known" types on the C side. (16, 32, 64 bits respectively.)

SBV->C translation avoids using "larger" types to represent smaller quantities. (i.e., representing 24-bit bit-vector with a 32-bit C variable.) Not because you can't do it, but rather it keeps the mapping simple so you don't have to keep masking constantly.

Having said that, there's an AES implementation in the distro that does generate AES 128, 192, 256 code:

https://github.com/LeventErkok/sbv/blob/ddd2c87f280e3bd29d1115b388282ec3a9b2647b/Documentation/SBV/Examples/Crypto/AES.hs#L881-L886

Perhaps this would be sufficient for you? I don't remember all the details, but these were surely coded so they avoid any "unsupported" parts of the SBV->C translator.

yav commented 7 months ago

I see, makes sense. One thing I've done when implementing this sort of thing in the past is to put the smaller bitvectors in the most signifiact part of the larger container word. This allows to avoid maksing in many cases (not always though).

LeventErkok commented 7 months ago

Interesting idea storing it on the top-part. I can see that works for addition, but not shifting or multiplication.

An alternative idea is to use System-C data-types, which has support for arbitrary size bitvectors. (https://systemc.org) It's a bit heavy-weight since it has a ton of other things you don't care about, but it might be an alternative as you don't have to worry about arbitrary bit-sizes. (Though it's a bit annoying since it has a different "class" for words <= 64 bits and > 64 bits, but you can mostly ignore that with some clever coding.)

If someone stripped out all the non-datatype part of system-c and published that as a separate library, I think that'd be kind of cool. Perhaps Galois can put out something like that, if there's sufficient interest.

LeventErkok commented 7 months ago

Closing this ticket since I think the AES implementation that gets shipped with SBV can handle your problem. If not, feel free to re-open.

yav commented 7 months ago

@LeventErkok I am not sure how to reopen, but it would be useful to do this. The goal is not to compile AES specifically, but rather to be able to compile more programs to C.

Btw, multiplication also works pretty easy too, you just need to normalize the one argument. For example x * y : [24] becomes x * (y >> 8). So not 0 cost, but really not that bad.

I agree that having a library for this sort of thing would be quite nice, as I've implemented this kind of thing a bunch of times. Perhaps a good starting point would be defining a small C library with all the operations we want.

LeventErkok commented 7 months ago

Yeah; I'd be open to changing the SBV->C compiler to generate code given such a library comes from a third-party. Otherwise, it's a lot of messy code that is easy to get wrong.

I'm re-opening the ticket; but unless a library of this sort comes along, not much is gonna happen on the SBV side. In fact, it'd be really nice to have a run-time C-library that we link against, and a Haskell library that lets you "create" code using that library in a Haskell idiomatic way. (Right now, it's literally spitting out strings, which works but is rather crappy.)

LeventErkok commented 7 months ago

Of course, once can argue we can try adding support for other types supported by SBV as well:

You almost want an SMTLib->C compiler (sans quantifiers and other non-executable things I suppose), but maybe that's too ambitious and not very useful in the end.

LeventErkok commented 6 months ago

@yav Does this always work?

#include <stdio.h>

// Does d range over -4 to 3?
typedef struct { int8_t d : 3; } SInt3;

int main () {
    SInt3 x, y, z;

    x.d = 2;
    y.d = 3;
    z.d = x.d + y.d;

    printf("%d + %d = %d\n", x.d, y.d, z.d);

    return 0;
}

This prints:

2 + 3 = -3

which is the correct result. Wondering if this is guaranteed; for signed/unsigned values and all kinds of operations. (In particular shifting, rotating, bit-lookup etc.)

I did some googling but didn't see anything definite about what is guaranteed. Most of the commentary is about having multiple bit-fields in the struct and how they're laid out, but in our case we don't really care about it except for the arithmetic.

If this works, then it'd be fairly easy to modify SBV to support bit-vectors sized 1-64. Anything larger will still be problematic, but that might cover most use cases.

LeventErkok commented 6 months ago

From what I can tell over various internet sources, the above is likely to work but there's nothing in the C-spec to actually guarantee this behavior:

Screenshot 2024-05-28 at 11 28 51 AM

Based on this, I'd say it's probably not a good idea to use bit-fields like this.

yav commented 5 months ago

@LeventErkok I started making a C library for doing this kind of stuff, it is here:

https://github.com/yav/cbits

I've done most of the operations for "small" bit vectors (sizes 1 to 64) (see include/small_bits.h).

Let me know what you think.