awslabs / LibMLKEM

Apache License 2.0
15 stars 1 forks source link

Comments on new ntt #4

Open weaversa opened 1 month ago

weaversa commented 1 month ago

This is great work @rod-chapman!

My main suggestion is to add documentation. Cryptol supports doc-strings - every definition should have one that describes the intent of the function along w/ citations to appropriate reference documents. For example:

https://github.com/awslabs/LibMLKEM/blob/6d192dda3e933632abf8250dbbea5c7e1c3a9ccf/spark_ada/experimental/ntt.cry#L23-L24 What is BitRev7? Why 7 when the input and outputs are both 8 bits? On the face it looks like there could be a typo. If the function was documented and a reference provided, understanding could be given rather than inferred/reverse engineered.

Otherwise, some nit-picky things --

Cryptol has a robust (@WeekNightMVP 😄) module system. The first line should be something like

module LibMLKEM::spark_ada::experimental::ntt where

Or where ever you want it to live - could be as simple as:

module ntt where

But it should have one!

I suggest being consistent with names and extensions thereof. For example,

https://github.com/awslabs/LibMLKEM/blob/6d192dda3e933632abf8250dbbea5c7e1c3a9ccf/spark_ada/experimental/ntt.cry#L8-L9

I suggest either

type Z_q = (Z MLKEM_Q) 
type Z_q_256 = [MLKEM_N]Z_q

or

type Zq = (Z MLKEM_Q) 
type Zq_256 = [MLKEM_N]Zq 

Both camelCase and snake_case are being used. I suggest you pick one. It's ideally best to follow the naming convention in whatever document you are following to create a specification. That way developers will have an easier time using the Cryptol spec along w/ other reference documentation to assist their development work.

Cryptol will soon have the ability to check properties in doc-strings (a la cargo test) -- https://github.com/GaloisInc/cryptol/pull/1682 So, properties like the following: https://github.com/awslabs/LibMLKEM/blob/6d192dda3e933632abf8250dbbea5c7e1c3a9ccf/spark_ada/experimental/ntt.cry#L44-L49 could be better supported like so:

/**
 * This property demonstrates that `naive_ntt` is the inverse of `naive_invntt`.
 * ```
 * :prove naive_ntt_inverts
 * ```
 */
 naive_ntt_inverts : Z_q_256 -> Bool 
 property naive_ntt_inverts f = f == naive_invntt (naive_ntt f)

It's also suggested to write properties (when possible) in the style of a rewrite rule. This assists with any future effort where saw would use such a property as a lemma to prove a larger property.

There are unnecessary (and distracting) parens in places. Here is an example: https://github.com/awslabs/LibMLKEM/blob/6d192dda3e933632abf8250dbbea5c7e1c3a9ccf/spark_ada/experimental/ntt.cry#L92 This could instead be:

new_v = [ v@x + zeta * v@(x + len) | x <- [ 0 .. < 2^^q ] ] 

Suggest removing miscellaneous newlines, for example: https://github.com/awslabs/LibMLKEM/blob/6d192dda3e933632abf8250dbbea5c7e1c3a9ccf/spark_ada/experimental/ntt.cry#L180-L181

Suggest exploring Cryptol's generate syntactic sugar (https://github.com/GaloisInc/cryptol/blob/f5fa503b1898e8bb9dc3358ada036adb4c54b832/docs/RefMan/BasicTypes.rst?plain=1#L161-L163) to simplify some expressions. For example:

https://github.com/awslabs/LibMLKEM/blob/6d192dda3e933632abf8250dbbea5c7e1c3a9ccf/spark_ada/experimental/ntt.cry#L27-L30 Could become:

ParametricNTT' : Z_q_256 -> Zq -> Z_q_256
ParametricNTT' f root = join (transpose [map sum f2, map sum f2Plus1])
  where
    f2, f2Plus1 : [128][128]Zq
    f2@i@j      = f@(2*j)   * root ^^ ((2*(BitRev7 i >> 1)+1)*j)
    f2Plus1@i@j = f@(2*j+1) * root ^^ ((2*(BitRev7 i >> 1)+1)*j)
> :prove \f root -> ParametricNTT' f root == ParametricNTT f root
Q.E.D.
(Total Elapsed Time: 13.740s, using "Z3")

How cool is that!

It might be worthwhile to add a comment saying that coerceSize is necessary here until the issues related to https://github.com/GaloisInc/cryptol/pull/1392 are addressed. That way, someone may be able to remove this blight at a later date.

rod-chapman commented 1 month ago

Many thanks for the comments. Note that some of the code (e.g. the "bitrev" function and the "ParametricNTT" function) were written by Galois (and copied here from the cryptol-specs repo just for ease of testing and stating the equivalence properties.) I won't be modifying that code, but we can pass those comments back to Galois, of course...

rod-chapman commented 1 month ago

I hope to contribute this work to the cryptol-specs repo, so it will naturally fit into their module system.

rod-chapman commented 1 month ago

I have also just commited a significant simplification of the ct_butterfly and gs_butterfly functions - I realized that the additional decomposition in ct_lowerhalf and ct_upperhalf was not really necessary. The specification of ct_butterfly now also closely matches the inner loop of Algorithm 8 in FIPS 203, which is a pleasant side-effect. Reference code updates to match are also now in entt.adb

weaversa commented 1 month ago

Using generate sugar cleans this up some (untested).

gs_butterfly :
    {m, hm}
    (m >= 2, m <= 8, hm >= 1, hm <= 7, hm == m - 1) =>
    [2^^m]Zq -> Zq -> [2^^m]Zq
gs_butterfly v zeta = new_v
  where
    lower, upper : [2^^hm]Zq
    lower@x = v@x  +  v@(x + halflen)
    upper@x = zeta * (v@(x + halflen) - v@x)
    new_v = coerceSize (lower # upper)
rod-chapman commented 1 month ago

Where is the generate thing documented?

weaversa commented 1 month ago

Where is the generate thing documented?

https://github.com/GaloisInc/cryptol/blob/f5fa503b1898e8bb9dc3358ada036adb4c54b832/docs/RefMan/BasicTypes.rst?plain=1#L161-L163

You can also try:

Cryptol> :h generate

    generate : {n, a, ix}
      (Integral ix, LiteralLessThan n ix) =>
      (ix -> a) -> [n]a

Produce a sequence using a generating function.
Satisfies 'generate f @ i == f i' for all 'i' between '0' and 'n-1'.

Declarations of the form 'x @ i = e' are syntactic sugar for
'x = generate (\i -> e)'.
rod-chapman commented 1 month ago

OK... generate syntactic sugar adopted.

pennyannn commented 1 month ago

@rod-chapman It is super cool that you established the equivalence between the slow and fast versions. Overall looks good to me and I just have some high-level comments.

I agree with @weaversa that the code should be in a module when integrated into Cryptol-specs. Even better, it could take use of Cryptol's module system when referencing exiting functions like BitRev7 instead of copying that function into the file.

Another high-level comment, I think it will be easier to read if references or comments are added especially for constant numbers. For example, MLKEM_Q and MLKEM_N, what are they? Is there a reference for base_zeta? Another one is zeta_expc, is there a reference for this lookup table? If it is generated by you, I think it will be great if the function for generating this table is included in the spec.

It is unclear how coerceSize is useful by looking at the code. I agree with @weaversa that a comment is necesary for explaining why.

rod-chapman commented 1 month ago

Thanks for the comments.

  1. Yes - how it gets modularized and integrated with Galois' module structure in the crypto-specs repo is TBD.
  2. The significance of the Q, N, Zeta constants is all in FIPS-203, but waiting for the final version to be published (any day now we hope...)
rod-chapman commented 1 month ago

I have tidied up a bit more - adopted the doc-string format for the property proof commands, and removed redundant parentheses. I await comms from Galois about how they prefer to integrate into cryptol-specs

rod-chapman commented 1 month ago

I have opened a PR to integrate this code into Galois' cryptol-specs repo. Here: https://github.com/GaloisInc/cryptol-specs/pull/95