mit-plv / fiat-crypto

Cryptographic Primitive Code Generation by Fiat
http://adam.chlipala.net/papers/FiatCryptoSP19/FiatCryptoSP19.pdf
Other
712 stars 147 forks source link

Go code cosmetic fixes #949

Open FiloSottile opened 3 years ago

FiloSottile commented 3 years ago
var x25 uint64
var x26 uint64
x26, x25 = bits.Mul64((arg1[7]), ((arg2[4]) * 0x2))
x26, x25 := bits.Mul64((arg1[7]), ((arg2[4]) * 0x2))
var x196 uint64 = (x193 >> 58)
x196 := x193 >> 58
bits.Mul64((arg1[7]), ((arg2[4]) * 0x2))
bits.Mul64(arg1[7], arg2[4] * 0x2)
func addcarryxU58(out1 *uint64, out2 *uint1, arg1 uint1, arg2 uint64, arg3 uint64) {
  var x1 uint64 = ((uint64(arg1) + arg2) + arg3)
  var x2 uint64 = (x1 & 0x3ffffffffffffff)
  var x3 uint1 = uint1((x1 >> 58))
  *out1 = x2
  *out2 = x3
}

addcarryxU58(&x20, &x21, 0x0, x1, (x19 & 0x3ffffffffffffff))
func addcarryxU58(arg1 uint1, arg2 uint64, arg3 uint64) (out1 uint64, out2 uint64) {
  var x1 uint64 = ((uint64(arg1) + arg2) + arg3)
  var x2 uint64 = (x1 & 0x3ffffffffffffff)
  var x3 uint1 = uint1((x1 >> 58))
  out1 = x2
  out2 = x3
  return
}

x20, x21 := addcarryxU58(0x0, x1, (x19 & 0x3ffffffffffffff))
// TightElement ...
//
// Bounds:
//
//     [0x0 ~> 0x400000000000000], [0x0 ~> 0x400000000000000], [0x0 ~> 0x400000000000000], [0x0 ~> 0x400000000000000], [0x0 ~> 0x400000000000000], [0x0 ~> 0x400000000000000], [0x0 ~> 0x400000000000000], [0x0 ~> 0x400000000000000], [0x0 ~> 0x200000000000000]
type TightElement *[8]uint64
/*
   The function Opp negates a field element.
   Postconditions:
     eval out1 mod m = -eval arg1 mod m
   Input Bounds:
     arg1: [[0x0 ~> 0x400000000000000], [0x0 ~> 0x400000000000000], [0x0 ~> 0x400000000000000], [0x0 ~> 0x400000000000000], [0x0 ~> 0x400000000000000], [0x0 ~> 0x400000000000000], [0x0 ~> 0x400000000000000], [0x0 ~> 0x400000000000000], [0x0 ~> 0x200000000000000]]
   Output Bounds:
     out1: [[0x0 ~> 0xc00000000000000], [0x0 ~> 0xc00000000000000], [0x0 ~> 0xc00000000000000], [0x0 ~> 0xc00000000000000], [0x0 ~> 0xc00000000000000], [0x0 ~> 0xc00000000000000], [0x0 ~> 0xc00000000000000], [0x0 ~> 0xc00000000000000], [0x0 ~> 0x600000000000000]]
 */
/*inline*/
func Opp(out1 *[9]uint64, arg1 *[9]uint64) {
// Opp negates a field element.
//
//    eval out1 mod m = -eval arg1 mod m
//
func Opp(out1 *LooseElement, arg1 *TightElement) {

This should be the first line in the file, which by convention will get recognized by various tooling as autogenerated code and left alone.

// Code generated by fiat-crypto. DO NOT EDIT.
mdempsky commented 3 years ago

I think these would be nice improvements if they're easy and don't add complexity to fiat-crypto's output routines (which aren't verified), but I don't think any of these affect the assembly ultimately generated by the Go compilers?

armfazh commented 3 years ago

@JasonGross definitely types for differentiating between Montgomery domain. +1

mdempsky commented 3 years ago

FWIW, the syntax rules for Go documentation are described at: https://golang.org/pkg/go/doc/#ToHTML

I think the main noteworthy things for fiat-go code are that blank lines separate paragraphs, and runs of indented lines get formatted with \<pre>.

JasonGross commented 3 years ago

I think the main noteworthy things for fiat-go code are that blank lines separate paragraphs, and runs of indented lines get formatted with <pre>.

@mdempsky What's the rule for runs of indented lines not separated from the previous line by a newline. That is, how is something like

line1

line2
  line3
  line4

line5

  line6
  line7
line8

  line9
  line10

Which indented lines become <pre> blocks?

JasonGross commented 3 years ago
  • run gofmt on the output (which will only fix some of the whitespace

This one is a bit tricky logistically, I'd rather not make gofmt a requirement for synthesizing the go files via the Makefile. I can change the output to match the spacing conventions of gofmt, though?

mdempsky commented 3 years ago

Which indented lines become <pre> blocks?

It looks like lines 3 and 4, 6 and 7, and 9 and 10 become pre blocks:

testing with `go doc` ``` mdempsky@mdempsky:/tmp/test$ ls test.go mdempsky@mdempsky:/tmp/test$ cat test.go /* line1 continued line2 continued line3 line4 continued line5 continued line6 line7 line8 continued line9 line10 */ package p mdempsky@mdempsky:/tmp/test$ go doc . package p // import "." line1 continued line2 continued line3 line4 continued line5 continued line6 line7 line8 continued line9 line10 ``` (This is the command-line output, but it uses the same parser. The 4-space indented blocks correspond to HTML `pre` tags; the unindented blocks correspond to HTML `p` tags.)

This one is a bit tricky logistically, I'd rather not make gofmt a requirement for synthesizing the go files via the Makefile.

Out of curiosity, what's the logistic difficulty here? I thought Go is pretty generally available nowadays.

I can change the output to match the spacing conventions of gofmt, though?

gofmt can be somewhat finnicky about things; e.g., it'll format 1 * 2 and 3 + 4, but then 1*2 + 3.

However, to the extent you're able to better match the output, that sounds good and desirable.

JasonGross commented 3 years ago

Out of curiosity, what's the logistic difficulty here? I thought Go is pretty generally available nowadays.

The files are regenerated by the default makefile target. I'd like to be able to say all of the following:

mdempsky commented 3 years ago

Makes sense. So we just need a formally verified implementation of gofmt in Coq? :)

FiloSottile commented 3 years ago
  • run gofmt on the output (which will only fix some of the whitespace

This one is a bit tricky logistically, I'd rather not make gofmt a requirement for synthesizing the go files via the Makefile. I can change the output to match the spacing conventions of gofmt, though?

Sure, you can run gofmt -d to see the changes it would apply to a file.

Yawning commented 3 years ago
*  use `uint64` instead of `uint1` and drop addcarryxU64 and subborrowxU64 wrappers

This is more than cosmetic. Getting rid of the wrappers and assuming that bits.Add64 behaves as advertised ("The carryOut output is guaranteed to be 0 or 1") improves curve25519 CarryMul performance by ~35%[0]. I assume CarrySquare will also see a significant performance gain.

    var x51 uint64
    var x52 uint1
    x51, x52 = addcarryxU64(x13, x7, 0x0)
    var x53 uint64
    x53, _ = addcarryxU64(x14, x8, x52)
    var x51 uint64
    var x52 uint64
    x51, x52 = bits.Add64(x13, x7, 0x0)
    var x53 uint64
    x53, _ = bits.Add64(x14, x8, x52)

While we're wishing for various fixes to the Go output, Selectznz specifies a private type for one of it's arguments (arg1 uint1), severely limiting the usefulness of the routine.

[0]: With go 1.16.5. 1.17beta1 does a better job of optimizing the current code so the gain is "only" ~31%.

JasonGross commented 3 years ago

This is more than cosmetic. Getting rid of the wrappers and assuming that bits.Add64 behaves as advertised ("The carryOut output is guaranteed to be 0 or 1") improves curve25519 CarryMul performance by ~35%[0]. I assume CarrySquare will also see a significant performance gain.

If you typedef uint1 to uint64, does that give you the performance benefit?

If removing the wrappers is really that important, I can add a rewriting pass to relax the bounds for go output.

Yawning commented 3 years ago

If you typedef uint1 to uint64, does that give you the performance benefit?

name \ time/op  baseline     typedef-ed   no-wrapper
CarryMult-8     66.4ns ± 1%  50.8ns ± 1%  45.9ns ± 1%
name \ time/op  baseline-1.17  typedef-ed-1.17  no-wrapper-1.17
CarryMult-8       61.7ns ± 0%      48.8ns ± 0%      44.7ns ± 0%

It is considerably faster, but the wrapper still has overhead, even though it gets inlined, because of what appears to be inline marking.

    0x023d 00573 (curve25519.go:404)    XCHGL   AX, AX
    0x023e 00574 (curve25519.go:406)    XCHGL   AX, AX
    0x023f 00575 (curve25519.go:409)    XCHGL   AX, AX
    0x0240 00576 (curve25519.go:411)    XCHGL   AX, AX
    0x0241 00577 (curve25519.go:413)    MOVQ    $2251799813685247, R9
    0x024b 00587 (curve25519.go:413)    ANDQ    R9, R12
    0x024e 00590 (curve25519.go:416)    XCHGL   AX, AX
    0x024f 00591 (curve25519.go:418)    XCHGL   AX, AX
    0x0250 00592 (curve25519.go:421)    XCHGL   AX, AX
    0x0251 00593 (curve25519.go:423)    XCHGL   AX, AX
    0x0252 00594 (curve25519.go:426)    XCHGL   AX, AX
    0x0253 00595 (curve25519.go:428)    XCHGL   AX, AX
    0x0254 00596 (curve25519.go:431)    XCHGL   AX, AX
    0x0255 00597 (curve25519.go:433)    XCHGL   AX, AX
    0x0256 00598 (curve25519.go:436)    XCHGL   AX, AX
    0x0257 00599 (curve25519.go:438)    XCHGL   AX, AX
    0x0258 00600 (curve25519.go:441)    XCHGL   AX, AX
    0x0259 00601 (curve25519.go:443)    XCHGL   AX, AX
    0x025a 00602 (curve25519.go:446)    XCHGL   AX, AX
    0x025b 00603 (curve25519.go:448)    XCHGL   AX, AX
    0x025c 00604 (curve25519.go:451)    XCHGL   AX, AX
    0x025d 00605 (curve25519.go:453)    XCHGL   AX, AX
    0x025e 00606 (curve25519.go:456)    XCHGL   AX, AX
    0x025f 00607 (curve25519.go:462)    XCHGL   AX, AX
    0x0260 00608 (curve25519.go:468)    XCHGL   AX, AX
    0x0261 00609 (curve25519.go:474)    XCHGL   AX, AX
    0x0262 00610 (curve25519.go:475)    SUBQ    CX, DX
    0x0265 00613 (curve25519.go:476)    MOVQ    DI, CX
    0x0268 00616 (curve25519.go:476)    SHRQ    $51, DI
    0x026c 00620 (curve25519.go:476)    SHLQ    $13, DX

Since I don't see soul crushing NOP-walls of doom in the version of the routine where I manually removed the wrapper, I assume that the compiler doesn't emit a InlMark (or gets rid of it at some point) for direct calls to bits.Add64 because of the special cases in the compiler's SSA code.

At some point (in the far future) the compiler behavior may resolve itself, but at least one related issue has been open since 2019 (https://github.com/golang/go/issues/29571).

mmcloughlin commented 2 years ago
  • use uint64 instead of uint1 and drop addcarryxU64 and subborrowxU64 wrappers

We should either do this or export uint1 as Uint1. At the moment there are exported functions taking arguments of unexported types.

https://github.com/mit-plv/fiat-crypto/blob/c60f9a6f40c7b380a3012ee36805918bfbd7de38/fiat-go/64/curve25519/curve25519.go#L570