bitcoin-core / secp256k1

Optimized C library for EC operations on curve secp256k1
MIT License
2.02k stars 977 forks source link

Add BIP352 `silentpayments` module #1519

Open josibake opened 2 months ago

josibake commented 2 months ago

This PR adds a new Silent Payments (BIP352) module to secp256k1. It is a continuation of the work started in https://github.com/bitcoin-core/secp256k1/pull/1471.

The module implements the full protocol, except for transaction input filtering and silent payment address encoding / decoding as those will be the responsibility of the wallet software. It is organized with functions for sending (prefixed with _sender) and receiving (prefixed by _recipient).

For sending

  1. Collect private keys into two lists: taproot_seckeys and plain_seckeys Two lists are used since the taproot_seckeys may need negation. taproot_seckeys are passed as keypairs to avoid the function needing to compute the public key to determine parity. plain_seckeys are passed as just secret keys
  2. Create the _silentpayment_recipient objects These structs hold the scan and spend public key and an index for remembering the original ordering. It is expected that a caller will start with a list of silent payment addresses (with the desired amounts), convert these into an array of recipients and then match the generated outputs back to the original silent payment addresses. The index is used to return the generated outputs in the original order
  3. Call silentpayments_sender_create_outputs to generate the xonly public keys for the recipients This function can be called with one or more recipients. The same recipient may be repeated to generate multiple outputs for the same recipient

For scanning

  1. Collect the public keys into two lists taproot_pubkeys and plain_pubeys This avoids the caller needing to convert taproot public keys into compressed public keys (and vice versa)
  2. Compute the input data needed, i.e. sum the public keys and compute the input_hash This is done as a separate step to allow the caller to reuse this output if scanning for multiple scan keys. It also allows a caller to use this function for aggregating the transaction inputs and storing them in an index to vend to light clients later (or for faster rescans when recovering a wallet)
  3. Call silentpayments_recipient_scan_outputs to scan the transaction outputs and return the tweak data (and optionally label information) needed for spending later

In addition, a few utility functions for labels are provided for the recipient for creating a label tweak and tweaked spend public key for their address. Finally, two functions are exposed in the API for supporting light clients, _recipient_created_shared_secret and _recipient_create_output_pubkey. These functions enable incremental scanning for scenarios where the caller does not have access to the transaction outputs:

  1. Calculating a shared secret This is done as a separate step to allow the caller to reuse the shared secret result when creating outputs and avoid needing to do a costly ECDH every time they need to check for an additional output
  2. Generate an output (with k = 0)
  3. Check if the output exists in the UTXO set (using their preferred light client protocol)
  4. If the output exists, proceed by generating a new output from the shared secret with k++

See examples/silentpayments.c for a demonstration of how the API is expected to be used.

Note for reviewers

My immediate goal is to get feedback on the API so that I can pull this module into https://github.com/bitcoin/bitcoin/pull/28122 (silent payments in the bitcoin core wallet). That unblocks from finishing the bitcoin core PRs while work continues on this module.

Notable differences between this PR and the previous version

See https://github.com/bitcoin-core/secp256k1/issues/1427 and https://github.com/bitcoin-core/secp256k1/pull/1471 for discussions on the API design. This iteration of the module attempts to be much more high level and incorporate the feedback from #1471. I also added a secp256k1_silentpayments_public_data opaque data type, which contains the summed public key and the input_hash. My motivation here was:

  1. I caught myself mixing up the order of arguments between A_sum and recipient_spend_key, which was impossible to catch with ARG_CHECKS and would result in the scanning process finishing without errors, but not finding any outputs
  2. Combining public key and input_hash into the same data type allows for completely hiding input_hash from the caller, which makes for an overall simpler API IMO

I also removed the need for the recipient to generate a shared secret before using the secp256k1_silentpayments_recipient_scan_outputs function and instead create the shared secret inside the function.

Outstanding work

josibake commented 2 months ago

Rebased on #1518 (https://github.com/bitcoin-core/secp256k1/commit/3d080277895655e8274ee73aacd154c4ead143e3 -> https://github.com/bitcoin-core/secp256k1/commit/8b48bf19c3c020e653734f6c9d9364e6a47a30d1, compare)

josibake commented 2 months ago

Updated https://github.com/bitcoin-core/secp256k1/commit/8b48bf19c3c020e653734f6c9d9364e6a47a30d1 -> https://github.com/bitcoin-core/secp256k1/commit/f5585d4b93606144e76e45ad3d43a797a9afefcf (bip352-silentpayments-module-rebase -> bip352-silentpayments-module-02, compare):

For the label scanning, I looked for an example of using an invalid public key but didn't see anything except for the invalid_pubkey_bytes in the tests. For now, if the output is found without a label, I'm setting found_with_label = 0 and saving the found output in both the output and label field. Happy to change this if there is a better suggestion for communicating an invalid public key.

I also used secp256k1_pubkey_save instead of output = *tx_outputs, as I think this makes the code more clear.

josibake commented 2 months ago

Thanks for the thorough review, @theStack ! I've addressed your feedback, along with some other changes.


Update https://github.com/bitcoin-core/secp256k1/commit/f5585d4b93606144e76e45ad3d43a797a9afefcf -> https://github.com/bitcoin-core/secp256k1/commit/1a3a00bd0999a89e30d5dc9f927592ead72ab7a3 (bip352-silentpayments-module-02 -> bip352-silentpayments-module-03, compare)

The sending tests now check that the generated outputs match exactly one of the possible expected output sets. Previously, the sending tests were checking that the generated outputs exist in the array of all possible outputs, but this wouldn't catch a bug where k is not being set correctly e.g. [Ak=0, Bk=0] would (incorrectly) pass [Ak=0, Bk=1, Ak=1, Bk=0] but will now (correctly) fail [[Ak=0, Bk=1], [Ak=1, Bk=0]]

josibake commented 2 months ago

Rebased on #1518 https://github.com/bitcoin-core/secp256k1/commit/1a3a00bd0999a89e30d5dc9f927592ead72ab7a3 -> https://github.com/bitcoin-core/secp256k1/commit/92f592023f3f4d6a66724772349fbdc4967ab50f (bip352-silentpayments-module-03 -> bip352-silentpayments-module-03-rebase, compare)

josibake commented 2 months ago

Rebased on master (following #1518 merge) https://github.com/bitcoin-core/secp256k1/commit/92f592023f3f4d6a66724772349fbdc4967ab50f -> https://github.com/bitcoin-core/secp256k1/commit/56ed901b6a2239e8b44a1a3f084bd38b9e86d769 (bip352-silentpayments-module-03-rebase -> bip352-silentpayments-module-04-rebase, compare)

josibake commented 1 month ago

Rebased on master to fix merge conflict https://github.com/bitcoin-core/secp256k1/commit/56ed901b6a2239e8b44a1a3f084bd38b9e86d769 -> https://github.com/bitcoin-core/secp256k1/commit/bd66eaa22acf434f0134d7f93c4fb694303708c3 (bip352-silentpayments-module-04-rebase -> bip352-silentpayments-module-05-rebase, compare)

josibake commented 1 month ago

CI failure seems related to not being able to install valgrind via homebrew and unrelated to my change so ignoring for now (cc @real-or-random for confirmation?).

josibake commented 1 month ago

Thanks for the review @theStack ! Sorry for the slow response, I somehow missed the notification for your review :sweat_smile:


Update https://github.com/bitcoin-core/secp256k1/commit/bd66eaa22acf434f0134d7f93c4fb694303708c3 -> https://github.com/bitcoin-core/secp256k1/commit/2dde8f1fa13687d2bd8328f85ac412a4052b040c (bip352-silentpayments-module-05-rebase -> bip352-silentpayments-module-06, compare)

Per https://github.com/bitcoin-core/secp256k1/pull/1519#discussion_r1617720108, I agree returning 0 is not the right thing to do, but having multiple error codes also seemed gross. I think an ARG_CHECK makes sense here because if the caller passed all valid seckeys / pubkeys and then they sum to zero, in principle its the caller passing incorrect arguments. The only thing the caller can do at this point is try again with different arguments. For the sender, this would mean repeating coin selection to get a different input set, and for the recipient this would mean skipping the transaction and moving on to the next one. Also happy to change if there is a better suggestion!

real-or-random commented 1 month ago

CI failure seems related to not being able to install valgrind via homebrew and unrelated to my change so ignoring for now (cc @real-or-random for confirmation?).

Indeed, see https://github.com/bitcoin-core/secp256k1/issues/1536

real-or-random commented 1 month ago

Some general notes

On error handling in general

Error handling is hard, and the caller usually can't really recover from an error anyway. This is in particular true on malicious inputs: there's no reason to try to continue dealing with the attacker, and you simply want to abort. That's why, as a general rule, we try to avoid error paths as much as possible. This usually boils down to merging all errors into a single one, i.e., a) have just a single error "code" for all possible errors, b) and in the case of a multi-stage thing involving multiple function calls, have just a single place where errors are returned.

Signature verification is a good example. A (signature, message, pubkey) triple is either valid or not. The caller should not care why exactly a signature fails to verify, so we don't even want to expose this to the caller.

However, signature verification this is also a nice example of a case in which we stretch the rules a bit. Signature verification is implemented as two-stage process: 1. Parse the public key (which can fail). 2. Check the signature (which can fail). Purely from a "safe" API point of view, this is not great because we give the user two functions and two error paths instead of one. Ideally, there could just be one verification function which also takes care of parsing (this is how it's defined BIP340). The primary reason why we want to have a separate parsing function in this case is performance: if you check several signatures under the same key, you don't want to parse, which involves computing the y-coordinate, every time.

ARG_CHECK

ARG_CHECK will call the "illegal (argument) callback", which, by default, crashes. See the docs here: https://github.com/bitcoin-core/secp256k1/blob/1791f6fce4d4856a4ce2b1982768a4ffa23fcc0a/include/secp256k1.h#L324 The callback/crash indicates to the caller that there's a bug in the caller's code.

What does this mean for this discussion?

  • Added ARG_CHECKs to check for the sum of the private keys / public keys being zero

Per #1519 (comment), I agree returning 0 is not the right thing to do, but having multiple error codes also seemed gross. I think an ARG_CHECK makes sense here because if the caller passed all valid seckeys / pubkeys and then they sum to zero, in principle its the caller passing incorrect arguments. The only thing the caller can do at this point is try again with different arguments. For the sender, this would mean repeating coin selection to get a different input set, and for the recipient this would mean skipping the transaction and moving on to the next one. Also happy to change if there is a better suggestion!

So let's take a look at the two sides:

On the sender side: The secret keys sum up to zero (sum_i a_i = 0)

This will happen only with negligible probability for honestly generated (=random) secret keys. That is, this will in practice only happen if the caller has a bug, or the caller has been tricked into using these secret keys, e.g., if someone else has crafted malicious secret keys for the caller. Since the latter is not a crazy scenario, we should not use ARG_CHECK here.

We can just return 0 here to indicate to the caller that we can't continue with these function inputs. And even if there are other error cases, I don't see a reason why the caller code should care much about why the function fails. As long as you call the function with honestly generated inputs, everything will work out. (Devs will be interested in the exact failure case when debugging the caller's code, but I think they can figure out during debugging then. "Normal" caller code should get just a single error code.)

On the recipient side: The public keys sum up to infinity (sum_i A_i = 0) [1]

Again, this can only happen if the sender is malicious. But since we're not the sender, it's entirely possible that the sender is malicious. And then these inputs are certainly legal, they're just not valid. (In the same sense as it's perfectly legal to use the signature verification algorithm on an invalid signature.) So an ARG_CHECK will not be appropriate at all: a malicious sender could trigger it and crash the scanning process.

We should also simply return 0 to indicate that this transaction is not well-formed/not eligible for SP. And again, even if there are other error cases, I don't see a reason why the caller should care why this transaction is not eligible.

Alternatively, we could even return 1, store infinity in the public_data, and simply make sure that scanning won't find any payments in that case. This would avoid the error path for this function entirely. But if the caller then calls secp256k1_silentpayments_recipient_create_shared_secret, I think we'd just postpone the error to this function, and for this function, I don't see another way than returning an error. So I'm not convinced that this is better.

[1] We should perhaps rename "infinity" to "zero"... ;)

josibake commented 1 month ago

@real-or-random thanks for the response, this is super helpful.

Devs will be interested in the exact failure case when debugging the caller's code, but I think they can figure out during debugging then

In hindsight, I think my preference for ARG_CHECK was "better error messages as to what went wrong," but I now realize it was because I was thinking as a dev ;). Also an oversight on my part: I didn't realize/forgot that ARG_CHECK is actually crashing the program by default. I certainly agree that we don't want this in either failure case.

Alternatively, we could even return 1, store infinity in the public_data, and simply make sure that scanning won't find any payments in that case. This would avoid the error path for this function entirely. But if the caller then calls secp256k1_silentpayments_recipient_create_shared_secret, I think we'd just postpone the error to this function, and for this function, I don't see another way than returning an error. So I'm not convinced that this is better.

If we imagine an index + light client scenario, the public_data would be created by the index and then sent to the light client, where the light client would call secp256k1_silentpayments_recipient_create_shared_secret (and then get the error). Given this, I think it would be better to have the error path so that the index ends up not storing any data at all for the malicious crafted transaction, which saves space for the index and bandwidth for the light client.


Thinking about this a bit more:

That's why, as a general rule, we try to avoid error paths as much as possible. This usually boils down to merging all errors into a single one, i.e., a) have just a single error "code" for all possible errors, b) and in the case of a multi-stage thing involving multiple function calls, have just a single place where errors are returned.

Most of the high-level functions in our API are calling multiple lower-level functions and so far the approach has been something like:

if (!secp256k1_func_that_returns_0_on_error(args)) {
    return 0;
}
...
if (!secp256k1_another_func_that_returns_0_on_error(args)) {
    return 0;
}

Perhaps its worth looking to consolidate and try and only return an error at the end of a multi-stage process? This would mean ignoring the return values for a lot of the lower level function calls, which initially made me feel a bit weird. But in light of your recent comment, feels like this might be the preferred approach?

EDIT: reading your comment again, I realize "error paths" is not really talking about branches in the code and more error paths for the user.

theStack commented 1 month ago

We should also simply return 0 to indicate that this transaction is not well-formed/not eligible for SP. And again, even if there are other error cases, I don't see a reason why the caller should care why this transaction is not eligible.

Makes sense. My worry was that without an explicit error-code for this corner case, some users wouldn't even be aware of an indirect "not eligible" case and more likely interpret a return value of 0 as "only possible if there's a logic error on our side, so let's assert for success" (given the passed in data is public and already verified for consensus-validity). But in the end that's more a matter of good API documentation I guess.

An example for the "input public keys sum up to point of infinity" case ($\sum_i A_i = 0$) is now available on the Signet chain via tx d73f4a19f3973e90af6df62e735bb7b31f3d5ab8e7e26e7950651b436d093313 [1], mined in block 198023. It consists of two inputs spending P2WPKH prevouts with negated pubkeys $(x,y)$ and $(x,-y)$ (easy to verify by looking at the second item of the witness stack each, where only the first byte for encoding the sign bit differs), and one dummy P2TR output. It hopefully helps SP implementations to identify potential problems with this corner case early. As first example and proof that it triggers the discussed code path, it makes the Silent Payment Index PR #28241 crash, which asserts on a return value of 1 for _recipient_public_data_create.

I think it would be also a good idea to add this scenario to the BIP352 test vectors, or at least a unit test in this PR?

[1] created with the following Python script: https://github.com/theStack/bitcoin/blob/202405-contrib-bip352_input_pubkeys_cancelled/contrib/silentpayments/submit_input_pubkeys_infinity_tx.py

real-or-random commented 1 month ago

~Perhaps its worth looking to consolidate and try and only return an error at the end of a multi-stage process? This would mean ignoring the return values for a lot of the lower level function calls, which initially made me feel a bit weird. But in light of your recent comment, feels like this might be the preferred approach?~

EDIT: reading your comment again, I realize "error paths" is not really talking about branches in the code and more error paths for the user.

Right, it's about avoiding errors that the user would need to deal with, e.g., by more branches on the user side. So the idea is that to try to avoid complexity in the user code, perhaps at the cost of adding complexity to our code. But let me also add that this and most all of what I above is more a rule of thumb than a strict policy.

We should also simply return 0 to indicate that this transaction is not well-formed/not eligible for SP. And again, even if there are other error cases, I don't see a reason why the caller should care why this transaction is not eligible.

Makes sense. My worry was that without an explicit error-code for this corner case, some users wouldn't even be aware of an indirect "not eligible" case and more likely interpret a return value of 0 as "only possible if there's a logic error on our side, so let's assert for success" (given the passed in data is public and already verified for consensus-validity). But in the end that's more a matter of good API documentation I guess.

Okay, I see the concern. And I agree, this should probably be a matter of documentation. I think, assuming our code is correct, there won't be a reason to return 0 unless there's something wrong with the transaction (invalid public key, keys sum up to 0), and we should just list these reasons in the docs.

I think it would be also a good idea to add this scenario to the BIP352 test vectors, or at least a unit test in this PR?

Indeed, this should be in the BIP, ideally even in the pseudocode. If our code starts to reject "payments", then better be authorized by the standard. And I believe it's the right approach: There's no need to add complexity to implementations to deal with these malicious cases.[^1]

[^1]: One thing to keep in mind are multi-sender scenarios where blaming the malicious participant could be hard. Say honest A and B have pkA and pkB, and malicious C then claims to have pkC = -(pkA + pkC) s.t. pkA + pkB + pkC = 0. And then A and B can't conclude that C is malicious. We run into such a thing in the MuSig BIP, see the first paragraph of https://github.com/bitcoin/bips/blob/master/bip-0327.mediawiki#dealing-with-infinity-in-nonce-aggregation. But IIUC this cannot happen here since C needs to show a signature under pkC.

josibake commented 1 week ago

Update https://github.com/bitcoin-core/secp256k1/commit/2dde8f1fa13687d2bd8328f85ac412a4052b040c -> https://github.com/bitcoin-core/secp256k1/commit/ac591050262d9d00b629943d598f62b47e1ca7ae (bip352-silentpayments-module-06 -> bip352-silentpayments-module-07, compare)

josibake commented 1 week ago

Update https://github.com/bitcoin-core/secp256k1/commit/ac591050262d9d00b629943d598f62b47e1ca7ae -> https://github.com/bitcoin-core/secp256k1/commit/5dd552ccbb5243047e0ad967561796f15a42bfbb (bip352-silentpayments-module-07 -> bip352-silentpayments-module-08, compare)

(note: I also included a rebase inadvertently, so I rebased the -07-rebase branch as well to make comparing the diff easier with this compare link)

The main change in this push is adding full test coverage for the module API. To make this easier, I ended up re-organizing the commits so that everything for sending is one commit, everything for labelled address is one commit, and everything for receiving is one commit. While this does make each commit larger, I think overall this makes things easier to review. I find it especially helpful to have the API tests in the same commit as it makes it easier to reason on how the functions will be used and makes it easy to modify the tests during testing to exercise the API.

While adding the tests, I made the following implementation changes:

Running gcov after these changes shows that the test coverage for the module is ~100% :smile: image

josibake commented 1 week ago

Not quite sure what is happening with the CI failures, since it looks like the all of the tests and examples are passing. Will investigate more and push a fix.

josibake commented 1 week ago

Update https://github.com/bitcoin-core/secp256k1/commit/5dd552ccbb5243047e0ad967561796f15a42bfbb -> https://github.com/bitcoin-core/secp256k1/commit/1136e0c6aa6884d67d984d62f480986b9824db99

CI was failing due to the benchmark executable. In the previous push, I added an ARG_CHECK that if a label_lookup callback is passed, label_cache cannot be NULL. However, I didn't run the benchmarks locally and missed that the benchmark was calling _scan_outputs with a label lookup callback but NULL labels cache. Fixed by passing a noop labels cache pointer. This is fine since the purpose of using the label_lookup in the benchmark isn't to actually scan for labels but to make sure that the label lookup branch of the code gets triggered during the benchmark.

jlest01 commented 5 days ago

Looks good to me overall.

Why does secp256k1_silentpayments_sender_create_outputs have const secp256k1_silentpayments_recipient **recipients as parameter instead of const secp256k1_silentpayments_recipient *recipients ? Is there a specific reason for this ? const secp256k1_silentpayments_recipient **recipients can be somewhat confusing. Why not have a pointer to the first element of an array of secp256k1_silentpayments_recipient structures ?

josibake commented 2 days ago

Update https://github.com/bitcoin-core/secp256k1/commit/1136e0c6aa6884d67d984d62f480986b9824db99 -> https://github.com/bitcoin-core/secp256k1/commit/8ce68efb9511124877d50e40bcea1563e1384ef8 (bip352-silentpayments-module-08 -> bip352-silentpayments-module-09, compare)

Primarily:

I noticed while clearing the secrets that we were creating the shared secrets in a really inefficient way:

Instead, I opted to remove secp256k1_ecdh and just calculate the the shared secret directly with ecmult_const. This avoids a lot of back and forth with serializing and deserializing and allowed me to remove the rather complicated noop hash function. Since secp256k1_ecdh is calling ecmult_const under the hood, this should be the same. However, I did notice that _cmov is being called inside secp256k1_ecdh, which led me down a bit of a rabbit hole. I left some TODO comments in places where we deserialize the secret keys into scalars for future reviewers, because I'm not quite sure what the right approach here is (cc the friendly neighborhood cryptographers @real-or-random , @jonasnick ).

I also refactored the create_shared_secret function to avoid needing to create an intermediate tweaked scalar, which also improves the readability of the code imo.

josibake commented 2 days ago

Update https://github.com/bitcoin-core/secp256k1/commit/8ce68efb9511124877d50e40bcea1563e1384ef8 -> https://github.com/bitcoin-core/secp256k1/commit/a31a105e6dd9446be7694226fac0e8e7bfafe300 (bip352-silentpayments-module-09 -> bip352-silentpayments-module-10, compare)

josibake commented 2 days ago

Why does secp256k1_silentpayments_sender_create_outputs have const secp256k1_silentpayments_recipient **recipients as parameter instead of const secp256k1_silentpayments_recipient *recipients ? Is there a specific reason for this ?

Thanks for the review, @jlest01 ! For the _recipient structs specifically, a pointer to an array of pointers is preferable for sorting the recipients in place. With an array of pointers each entry is 8 bytes vs with recipient structs each entry is ~136 bytes, making it more efficient to move the pointers around.

In general, pointers to an array of pointers is preferred as it provides more flexibility for the caller. For example, if the caller is using n inputs that all have the same public key, they can initialize the secp256k1_pubkey object once and then create a pointer to it n times. This is more efficient than requiring them to initialize n structs for the same public key.

jlest01 commented 2 days ago

Thanks for the clarification @josibake . Just out of curiosity. Is there a PR in bitcoin core that uses the interface being implemented here ? I would like to better understand how this interface is intended to interact with Bitcoin Core classes.

josibake commented 1 day ago

Thanks for the review, @Sjors !

Maybe move the changes in src/modules/silentpayments/tests_impl.h to their own commit?

This is how it was structured before, but I found it much easier to work with having the tests in the same commit where functions are added to the API. This allows you to easily make changes to the API and the relevant tests in the same commit and also lets reviews step through each commit and verify that the commit compiles and passes the tests.

Will digest your feedback on the examples/silentpayments.c and update!


Just out of curiosity. Is there a PR in bitcoin core that uses the interface being implemented here ?

@jlest01 all of the Bitcoin Core PRs can be found here: https://github.com/bitcoin/bitcoin/issues/28536