Trusted Setup Interchange Format feedback

Feedback on the proposal: https://github.com/mratsim/constantine/blob/f57d071/constantine/trusted_setups/README.md

https://discord.com/channels/595666850260713488/694537838691352576/1140659053035540540

from @jtraglia

jtraglia — 08/14/2023 4:51 PM
I'm still reviewing, but here are some initial remarks. Posting individually in case you want to respond to a specific one.
It would be nice if the version field was terminated with a NUL byte so it could be treated as a string. Maybe steal some bytes from the protocol field & make the version field 8 bytes. This would also allow more version variations, such as "v1.2.3" (and a NUL byte).
Througout the spec, integers should be explicity specified as unsigned. The sentence with "[0, n)" at the top of the metadata section isn't enough IMO.
The "fields" field should probably be renamed to "fields count"/"num fields"/etc for clarity.
Instances of "see dedicated section" should be more specific, point to the exact section.
I have some minor nits about consistent capitalization of field names & descriptions. Can elaborate later.
There's a missing period in "NUL bytes (0x00) Data" sentence, between ")" and "D".
I agree that things should be represented as little-endian, but I do have some concerns. Can discuss later.
Oh, also the formatting of the list in the "Quick algebra refresher" section is a little off.

jtraglia — 08/14/2023 9:51 PM
After implementing this for c-kzg-4844, I have a few more remarks.
I didn't pick up that the g1 & g2 points were affines. I thought they were blst_p1 & blst_p2. 
I think it would be more natural to have the schema above each data, instead of grouped together.
So instead of {header, schema, schema, schema, data, data, data} it would be {header, schema, data, schema, data, schema, data}.

jtraglia — 08/14/2023 9:59 PM
And yes, it is quite fast. For me it took ~400µs to load the trusted setup.
/cc @asn (George Kadianakis) too.

jtraglia — 08/14/2023 11:07 PM
Also, the example for metadata->version is misleading. Not a bad idea using u8's there.

From @kevaundray

Did a quick skim so can look again after these initial remarks 🙂 

Montgomery representation

I don't think we should expose montgomery representation because to me this is an un-necessary abstraction leak*. This was the case in gnark, for example, where it was exposed for performance reasons. Though they have since gotten rid of exposing montgomery in the API: https://github.com/Consensys/gnark-crypto/issues/276

The benefits of exposing montgomery form is performance, though if you don't expose montgomery form, then you no longer need to speak about whether its 32/64 bit limbs and you also don't need to distinguish between possible primes which do not use montgomery form because all of that would be an implementation detail behind the abstraction.

Standard serialization

A carry-on point from the previous one, most elliptic curves have a "standard" serialization strategy, do we have reason to not use it?

Padding

I don't have a strong opinion about this, just wondering if we should be padding the trusted setup for possible SIMD benefits without benchmarking?

General beliefs and assumptions

I'm under the assumption that the trusted setup will only be loaded upon startup, so once you get within a certain loading time, it only makes sense to apply optimizations as long as they do not leak abstractions. Perhaps there's reason to say that we have not achieved that loading time?

mratsim / constantine

Trusted Setup Interchange Format feedback #256