zcash / zips

Zcash Improvement Proposals
https://zips.z.cash
MIT License

[ZIP 231] Decouple memos from transaction outputs #627

Open nuttycom opened 2 years ago

nuttycom commented 2 years ago

This is an idea that was discussed in the process of Orchard protocol development, but was rejected at the time as being too significant a difference relative to Sapling. I thought we'd opened an issue somewhere to track it, but now have been unable to find that issue.

The idea is this: instead of storing a full 512-byte memo with each Orchard action, store a 32-byte symmetric key, and then, in a separate part of the transaction format, store 0..n 512-byte memo chunks, where k of those chunks may be decrypted by a particular action's decryption key.
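To make the size trade-off concrete, here is a rough sketch of the memo-related bytes per transaction under each layout. It counts only the 512-byte memos and 32-byte keys mentioned above; authentication tags and any other ciphertext overhead are ignored, and the function names are made up for illustration.

```python
MEMO_CHUNK_SIZE = 512  # bytes per memo (or memo chunk), as described above
MEMO_KEY_SIZE = 32     # bytes per per-action symmetric key, as described above


def memo_bytes_per_action_memo(num_actions: int) -> int:
    """Memo bytes when every action carries its own full 512-byte memo."""
    return num_actions * MEMO_CHUNK_SIZE


def memo_bytes_decoupled(num_actions: int, num_chunks: int) -> int:
    """Memo bytes when each action carries a 32-byte key and the
    512-byte chunks live in a separate part of the transaction."""
    return num_actions * MEMO_KEY_SIZE + num_chunks * MEMO_CHUNK_SIZE


# Four actions, only one of which actually needs a (single-chunk) memo.
print(memo_bytes_per_action_memo(4))   # 2048
print(memo_bytes_decoupled(4, 1))      # 640
```

The decoupled layout only costs more once the transaction carries nearly as many chunks as actions, which is the case the chaining use case cares about.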

An alternative proposal put forth by @ebfull was to always include exactly one memo per transaction, instead of one memo per output, so as to leak less information. However, so severe a restriction could close off important use cases, such as chaining multiple memos to attach longer messages (as can already be done using multiple outputs within a single transaction).

The number of memo chunks attached to a transaction could also be taken into account in fee computations.

nathan-at-least commented 2 years ago

The big trade-off here is between efficiency and privacy.

With orchard actions, we have "this thing might have any combination of an input, an output, and/or a memo" which provides some metadata privacy. So a question is "how much benefit, and is it worth it?"

It's easiest for me to ponder the extreme case where txns can have 0 memo space, and people have sometimes advocated for that since before launch. In that case it seems quite likely users or wallet devs may prefer for "purely financial transfers" to have no memo. If that happens, then every usage of memos would become a distinguishing factor.

Now, the current design might hide some, but very likely not all, usage distinctions. The max(in, out, memo) arity is visible in Orchard, and let's imagine there's reason to infer from out-of-band hints that the only use case that has a high max(in, out, memo) is some new memo-based app that chains many memos together in a transaction. Now the usage of that app can be distinguished. But then let's say a new distinct use case emerges that involves big fan-out payments. Now it's more difficult to distinguish between these two use cases (assuming they have similar arity distributions), so privacy is improved a bit. If we unlink the in, out, and memo arities, then those two distinct use cases would be distinguishable, which seems bad.
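A toy illustration of that argument, under the simplifying assumption that an observer sees only max(in, out, memo) in the linked design, and sees the action count and the memo chunk count separately in the unlinked one. The `TxShape` type and the two example transactions are invented for this sketch.

```python
from typing import NamedTuple, Tuple


class TxShape(NamedTuple):
    """What a transaction logically needs (hypothetical model)."""
    inputs: int
    outputs: int
    memos: int


def visible_linked(tx: TxShape) -> int:
    """Linked design: only the combined arity is observable."""
    return max(tx.inputs, tx.outputs, tx.memos)


def visible_unlinked(tx: TxShape) -> Tuple[int, int]:
    """Unlinked design (assumed): action count and memo count are
    observable separately."""
    return (max(tx.inputs, tx.outputs), tx.memos)


memo_app = TxShape(inputs=1, outputs=1, memos=8)  # chains eight memos
fan_out = TxShape(inputs=1, outputs=8, memos=0)   # pays eight recipients

print(visible_linked(memo_app) == visible_linked(fan_out))      # True
print(visible_unlinked(memo_app) == visible_unlinked(fan_out))  # False
```

In the linked model both transactions look like "arity 8"; in the unlinked model they separate into (1, 8) and (8, 0).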

Again, all of this is trading off against efficiency. If we didn't care about efficiency we could go the opposite direction, say every transaction is a gigabyte and may contain up to millions of inputs or outputs or memo data, and no one could distinguish the usage based on arity. Obviously not practical (yet ;-) ).

I'd like to know more about this kind of metadata analysis before changing this knob.

nathan-at-least commented 2 years ago

BTW, I just filed a very general ticket, #629, having been partially reminded of it by this conversation. If we could find a more scalable architecture, it may remove a lot of the efficiency constraints of memos in the first place. (This would be a much broader architectural change compared to this ticket, which is a much more tightly scoped trade-off change.)

str4d commented 8 months ago

Our current CompactBlock approach is "pretend the AEAD is just a stream cipher and only decrypt the first X bytes", to avoid downloading the memos during scanning. However, that means we have to enhance every transaction to figure out whether or not it has a memo. We could instead define the all-zeroes memo key as "no memo key", so clients immediately know that the output has no associated memo. Then we'd fetch the entire note ciphertext including tag during scanning (and omit the separate memo data).
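A minimal sketch of the "all-zeroes means no memo" check, assuming the scanner has already decrypted each output and extracted its 32-byte memo key. The function names and the `(txid, memo_key)` pair representation are hypothetical, not part of any existing wallet API.

```python
from typing import Iterable, List, Tuple

MEMO_KEY_SIZE = 32
NO_MEMO_KEY = bytes(MEMO_KEY_SIZE)  # 32 zero bytes, the proposed sentinel


def output_has_memo(memo_key: bytes) -> bool:
    """True unless the memo key is the all-zeroes 'no memo' sentinel."""
    if len(memo_key) != MEMO_KEY_SIZE:
        raise ValueError("memo key must be 32 bytes")
    return memo_key != NO_MEMO_KEY


def txids_needing_memo_fetch(
    decrypted_outputs: Iterable[Tuple[str, bytes]],
) -> List[str]:
    """Return the txids whose memo data is worth downloading, given
    (txid, memo_key) pairs recovered during scanning."""
    return [txid for txid, key in decrypted_outputs if output_has_memo(key)]


outputs = [("tx-a", bytes(32)), ("tx-b", bytes([7]) * 32)]
print(txids_needing_memo_fetch(outputs))  # ['tx-b']
```

With a check like this, the client would only need to fetch memo data for outputs that actually have it, rather than enhancing every transaction.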

nuttycom commented 8 months ago

One important use case for memo extension is the ability to provide signed reply-to addresses; in the case of a signature approach based on ZIP 304 extended to Orchard, the required signature cannot fit within a single 512-byte memo field.

str4d commented 8 months ago

@nuttycom and I discussed this a few weeks back, to figure out some design requirements.

If we consider having a Vec<MemoSegment> field in the v6 transaction, then when a recipient has a memo key from an output, they will be able to decrypt k out of n segments. The segments are directly concatenated in the order in which they appeared in the Vec, and then exposed to the recipient as a single variable-length MemoBytes.

str4d commented 8 months ago

For the encryption scheme, I think something like STREAM (from Online Authenticated-Encryption and its Nonce-Reuse Misuse-Resistance) would be suitable here. To decrypt a memo, the recipient would initialize a counter to zero, and then attempt to decrypt the memo segments in order with their memo key, with the nonce set to the counter. Each time they successfully decrypt a memo segment, they increment the counter and continue. We don't need to have a "last chunk" flag because we would not have any short segments (unlike e.g. age), and we don't have length extension or truncation concerns because transactions are fully committed to in the chain and signed with spendAuthSig.
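The trial-decryption loop described above could look roughly like the following. This is only a sketch: `toy_seal` and `toy_open` are a stand-in AEAD built from the Python standard library so the example runs on its own, and they are NOT a proposal for the real cipher. The 12-byte little-endian nonce encoding and the 16-byte tag appended to each 512-byte segment are likewise assumptions.

```python
import hashlib
import hmac
from typing import List, Optional

SEGMENT_SIZE = 512  # plaintext bytes per memo segment
TAG_SIZE = 16       # assumed tag length for the toy AEAD


def _nonce(counter: int) -> bytes:
    # Assumed encoding: counter as a 12-byte little-endian nonce.
    return counter.to_bytes(12, "little")


def toy_seal(key: bytes, counter: int, plaintext: bytes) -> bytes:
    """Toy encrypt-then-MAC stand-in for a real AEAD (illustration only)."""
    nonce = _nonce(counter)
    stream = hashlib.shake_256(b"enc" + key + nonce).digest(len(plaintext))
    ct = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, b"mac" + nonce + ct, hashlib.sha256).digest()[:TAG_SIZE]
    return ct + tag


def toy_open(key: bytes, counter: int, segment: bytes) -> Optional[bytes]:
    """Return the plaintext, or None if the tag does not verify."""
    nonce = _nonce(counter)
    ct, tag = segment[:-TAG_SIZE], segment[-TAG_SIZE:]
    expected = hmac.new(key, b"mac" + nonce + ct, hashlib.sha256).digest()[:TAG_SIZE]
    if not hmac.compare_digest(tag, expected):
        return None
    stream = hashlib.shake_256(b"enc" + key + nonce).digest(len(ct))
    return bytes(c ^ s for c, s in zip(ct, stream))


def decrypt_memo(key: bytes, segments: List[bytes]) -> bytes:
    """Try every segment in order; the counter advances only on success,
    and the decrypted segments are concatenated into one memo."""
    counter = 0
    memo = b""
    for segment in segments:
        chunk = toy_open(key, counter, segment)
        if chunk is not None:
            memo += chunk
            counter += 1
    return memo


# Two recipients whose segments are interleaved in one transaction.
key_a, key_b = bytes([1]) * 32, bytes([2]) * 32
segments = [
    toy_seal(key_a, 0, b"A" * SEGMENT_SIZE),
    toy_seal(key_b, 0, b"B" * SEGMENT_SIZE),
    toy_seal(key_a, 1, b"a" * SEGMENT_SIZE),
]
memo_a = decrypt_memo(key_a, segments)
memo_b = decrypt_memo(key_b, segments)
print(len(memo_a), len(memo_b))  # 1024 512
```

Because the counter only advances on a successful decryption, each recipient recovers exactly their k of the n segments, in order, without the transaction recording which segment belongs to which output.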

daira commented 8 months ago

I'm provisionally assigning number ZIP 231 for this.

There are two arguments for assigning a 2xx number:

ZIP 231 is the first unassigned number after ZIP 230 for v6 transactions (#686, #687). That makes sense since these ZIPs will in practice usually need to be read together in that order, which is consistent with the following in ZIP 0:

Try to number ZIPs that should or will be deployed together consecutively (subject to the above conventions), and in a coherent reading order.

daira commented 8 months ago

Penumbra made a similar change to use memo keys: issue https://github.com/penumbra-zone/penumbra/issues/1222, PR https://github.com/penumbra-zone/penumbra/pull/1371.

daira commented 3 months ago

TODO: look at what Penumbra are doing.

str4d commented 3 months ago

Comments from Arborist call today:

nuttycom commented 3 months ago

With respect to a memo pruning height, it's important to consider incentives. What incentive does a user have to request that their memo be optionally prunable? It's basically the user saying that they'd like to help out node operators by reducing the burden of storing their memos.

One possibility here (that I have not analyzed in depth) is that the memo fee could be a function of the length of its retention, with a fixed set of possible retention lengths to choose from - maybe retention lengths could be Fibonacci numbers or something of the sort? Then, the conventional fee would be a function of size and retention length, and indefinite retention wouldn't even be an option.
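Purely to illustrate the shape of such a fee rule: the sketch below picks a fixed set of Fibonacci retention lengths and charges in proportion to chunk count times retention. The retention unit, the set of choices, the linear formula, and the per-chunk rate are all invented here; none of them come from a ZIP or from this discussion.

```python
# Hypothetical retention choices, in some unspecified unit (e.g. months).
RETENTION_CHOICES = (1, 2, 3, 5, 8, 13, 21)

# Hypothetical rate in zatoshis per chunk per retention unit.
FEE_PER_CHUNK_UNIT = 5000


def memo_fee(num_chunks: int, retention: int) -> int:
    """Illustrative fee: linear in both memo size and retention length.

    Indefinite retention is not expressible, because `retention` must be
    one of the fixed choices.
    """
    if retention not in RETENTION_CHOICES:
        raise ValueError(f"retention must be one of {RETENTION_CHOICES}")
    if num_chunks < 0:
        raise ValueError("num_chunks must be non-negative")
    return num_chunks * retention * FEE_PER_CHUNK_UNIT


print(memo_fee(num_chunks=2, retention=5))  # 50000
```

Under a rule of this kind, choosing a shorter retention is directly cheaper, which gives users the incentive that a purely optional pruning flag lacks.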

str4d commented 3 months ago

One possibility here (that I have not analyzed in depth) is that the memo fee could be a function of the length of its retention

We previously discussed similar ideas in the context of "older notes pay more fee" (effectively a storage fee for value pools) in https://github.com/zcash/zcash/issues/3693. We closed that as obsoleted by ZIP 317, but the conversations there would be relevant if we were tying memo retention to fees.

nuttycom commented 3 months ago

It doesn't seem to me like zcash/zcash#3693 is fully obsoleted by ZIP 317; it's orthogonal so I've reopened it.

str4d commented 1 month ago

In the R&D meeting today we determined that this definitely won't be ready for NU6.