privacy-scaling-explorations / zk-kit

A monorepo of reusable libraries for zero-knowledge technologies.
https://zkkit.pse.dev
MIT License
281 stars 68 forks source link

Add `be` (Big Endian) prefix to the relevant conversion functions in the `utils` package #153

Closed cedoor closed 6 months ago

cedoor commented 7 months ago

Describe the improvement you're thinking about

It might be worth considering adding a be prefix to the relevant conversion functions in the @zk-kit/utils package, to make it more consistent and clear they're always picking an order, as there isn't really a safe "default option".

cedoor commented 7 months ago

@artwyman I was thinking about this again. Those functions will actually be used outside zk-kit as well. From what I have seen, most functions that convert numbers assume BE as the default representation. I wonder if adding the "be" prefix can make them confusing for people who are not used to caring about representations.

artwyman commented 7 months ago

Those functions will actually be used outside zk-kit as well.

Will be used in future, or are used today? I want to clarify if you're concerned about breaking existing code, or simply about clarity for future users.

From what I have seen, most functions that convert numbers assume BE as the default representation. I wonder if adding the "be" prefix can make them confusing for people who are not used to caring about representations.

There isn't really a global default endianness. It depends on the platform and the usage. E.g. IP specifies big-endian (sometimes referred to as "network byte order") while most hardware platforms (e.g. x86/x64) are little-endian. In most situations, programmers have fixed-width numbers (e.g. 32-bit or 64-bit) and pass them to a library which knows the right encoding, so they don't have to think about it. My thought for zk-kit would be to encourage most interfaces to be based on bigints (or stringified bigints) so that users don't have to think about byte order. That seems to be already where most of your interfaces are leaning.

That being said, it's fine for an individual library to have a default choice on byte-order. The question is where are people using it, and is it beneficial for them to pick a safe default, or is it something they need to think about. Giving both functions names which specify a byte order was my suggestion leaning in the direction of "you need to think about this, so let's make it explicit".

A potential compromise, particularly if you're worried about existing users, would be to provide all three options, and have bufferToBigint be an alias for beBufferToBigint. A less-invasive compromise might be to add more explicit documentation so that it's at least clear which byte order is being used as default.

cedoor commented 7 months ago

Will be used in future, or are used today? I want to clarify if you're concerned about breaking existing code, or simply about clarity for future users.

They will be used in the future. I'm mainly concerned about devs who are used to libraries like ethers.js or web3.js, which should also have similar functions to convert from hex to bigint etc, but they don't use any prefix. They definitely mention the representation they're using in the comments about the functions though. Ex. https://github.com/ethers-io/ethers.js/blob/ad5f1c5fc7b73cfb20d9012b669e9dcb9122d244/src.ts/utils/maths.ts#L134

There isn't really a global default endianness.

Thanks for this context. I don't know much about this stuff. I've always assumed that there was in fact a de-facto standard on BE.

My thought for zk-kit would be to encourage most interfaces to be based on bigints (or stringified bigints) so that users don't have to think about byte order. That seems to be already where most of your interfaces are leaning.

I totally agree with this. I think it is also time to change the target of the JS bundles to ES6. ES6 also supports native bigints syntax: 8n (instead of BigInt(8)). It makes sense for it to be string in some cases since JSONs do not support native bigints.

"You need to think about this, so let's make it explicit".

I get it! This might make sense for libraries related to baby jubjub, not sure about the others.

A potential compromise, particularly if you're worried about existing users, would be to provide all three options, and have bufferToBigint be an alias for beBufferToBigint. A less-invasive compromise might be to add more explicit documentation so that it's at least clear which byte order is being used as default.

I like this compromise. I'd export three options and also add documentation to explain when a byte order is used as the default one when there's no explicit prefix.

artwyman commented 7 months ago

There isn't really a global default endianness.

Thanks for this context. I don't know much about this stuff. I've always assumed that there was in fact a de-facto standard on BE.

There being no global default doesn't mean there aren't areas which picked a default. It's possible that Etherium has driven a default choice of BE for Web3 stuff. Most of my experience dealing with byte-order is from years ago, so I also can't speak to how the landscape may've changed. Long story short, I don't mind if you do whatever you think is best here. I do think that whatever you do should be clearly documented.

I totally agree with this. I think it is also time to change the target of the JS bundles to ES6. ES6 also supports native bigints syntax: 8n (instead of BigInt(8)). It makes sense for it to be string in some cases since JSONs do not support native bigints.

We recently changed the Zupass repo over to ES2020 for the same reason, since I was annoyed by the lack of bigint literals while working on the same recent experiments which raised this issue.

I like this compromise. I'd export three options and also add documentation to explain when a byte order is used as the default one when there's no explicit prefix.

I tend to be a fan of explicitness, and of giving developers options.