Code generated using --no-wide-int contains infinite loop

imeckler commented 3 years ago

I ran

./src/ExtractionOCaml/word_by_word_montgomery pasta_fp 64 '2^254 + 45560315531419706090280762371685220353' --no-wide-int

and got output containing the function

/*
 * The function fiat_pasta_fp_mulx_u64 is a multiplication, returning the full double-width result.
 * Postconditions:
 *   out1 = (arg1 * arg2) mod 2^64
 *   out2 = ⌊arg1 * arg2 / 2^64⌋
 *
 * Input Bounds:
 *   arg1: [0x0 ~> 0xffffffffffffffff]
 *   arg2: [0x0 ~> 0xffffffffffffffff]
 * Output Bounds:
 *   out1: [0x0 ~> 0xffffffffffffffff]
 *   out2: [0x0 ~> 0xffffffffffffffff]
 */
void fiat_pasta_fp_mulx_u64(uint64_t* out1, uint64_t* out2, uint64_t arg1, uint64_t arg2) {
  uint64_t x1;
  uint64_t x2;
  fiat_pasta_fp_mulx_u64(&x1, &x2, arg1, arg2);
  *out1 = x1;
  *out2 = x2;
}

which loops forever.

The code without --no-wide-int is correct.

JasonGross commented 3 years ago

Thanks for the report!

What's the thing you're aiming at here?

Documentation for what options are expected to work together is severely lacking. Currently the use-cases for --no-wide-int are 64-bit Go code (the Go backend has a primitive which replaces mulx) and bedrock2 (where using --no-wide-int in conjunction with --split-multiret eliminates uses of mulx, but introduces identifiers that the C backend doesn't support (passing --no-select removes one of them, and I'd need to add support for xor/^ to handle remaining things)).

imeckler commented 3 years ago

I would like to generate C code that uses only 64 bit integers (i.e., works with C compilers that don't know about any 128 bit ints) for the given prime. Which backend is used is not important to me. Is this possible? I can replace this mulx code with some C that works without 128 bit ints, but 128 bit ints are still used throughout the generated file.

EDIT: Actually, looks like they're just used in addcarryx and subborrowx. I can replace these manually as well, but would be good to know if it's already in fiat-crypto.

JasonGross commented 3 years ago

The C code in fiat-bedrock2 does not use 128-bit ints. It's a bit arcane for other reasons. Let me look into generating more standard C code like this as well.

What's the application, if I may ask? (I like knowing where fiat-crypto is used)

JasonGross commented 3 years ago

Note also that if you pass --no-primitives, you can suppress the generation things like addcarryx, etc, so you can provide your own implementations. But in fact you do need some way to add two 64-bit integers with carrying, and some way to multiply two 64-bit integers and get two 64-bit integers out. Whether you do this with compiler intrinsic or bithacks or 128-bit integers is up to you. If you have a suggestion of some implementation for these primitives, I can see if I can add an option to make use of such an alternate implementation.

imeckler commented 3 years ago

The C code in fiat-bedrock2 does not use 128-bit ints. It's a bit arcane for other reasons. Let me look into generating more standard C code like this as well.

What's the application, if I may ask? (I like knowing where fiat-crypto is used)

It's for a C reference implementation of the transaction signer for the Mina blockchain.

It needs to be super portable since it will be used as a starting point for compiling for a variety of embedded systems (for HSMs basically).

imeckler commented 3 years ago

Thank you for all the headses up. The code I replaced it with is not close to optimal, but in case you're interested:

void fiat_pasta_fp_addcarryx_u64(uint64_t* out1, fiat_pasta_fp_uint1* out2, fiat_pasta_fp_uint1 arg1, uint64_t arg2, uint64_t arg3) {
  uint64_t tmp = arg3 + arg1;
  *out1 = arg2 + tmp;
  *out2 = (arg2 > *out1) | (arg3 > tmp);
}

void fiat_pasta_fp_subborrowx_u64(uint64_t* out1, fiat_pasta_fp_uint1* out2, fiat_pasta_fp_uint1 arg1, uint64_t arg2, uint64_t arg3) {
  uint64_t tmp = arg3 + arg1;
  *out1 = arg2 - tmp;
  *out2 = (arg2 < *out1) | (arg3 > tmp);
}

void fiat_pasta_fp_mulx_u64(uint64_t* out1, uint64_t* out2, uint64_t a, uint64_t b) {
  uint64_t    a_lo = (uint32_t)a;
  uint64_t    a_hi = a >> 32;
  uint64_t    b_lo = (uint32_t)b;
  uint64_t    b_hi = b >> 32;

  uint64_t    a_x_b_hi =  a_hi * b_hi;
  uint64_t    a_x_b_mid = a_hi * b_lo;
  uint64_t    b_x_a_mid = b_hi * a_lo;
  uint64_t    a_x_b_lo =  a_lo * b_lo;

  uint64_t    carry_bit = ((uint64_t)(uint32_t)a_x_b_mid +
                          (uint64_t)(uint32_t)b_x_a_mid +
                          (a_x_b_lo >> 32) ) >> 32;

  uint64_t    multhi = a_x_b_hi +
                      (a_x_b_mid >> 32) + (b_x_a_mid >> 32) +
                      carry_bit;

  *out2 = multhi;
  // TODO: This multiplication could be avoided.
  *out1 = a * b;
}

JasonGross commented 3 years ago

This is a bit tricky to do in part because we start with templates that need to work for all integer sizes, not just 64-bit integers. I'd suggest just using the C code in fiat-bedrock2/src/

mit-plv / fiat-crypto

Code generated using --no-wide-int contains infinite loop #901