bitcoinjs / bitcoinjs-lib

A javascript Bitcoin library for node.js and browsers.
MIT License
5.72k stars 2.11k forks source link

Allowing witnessUtxo and nonWitnessUtxo when creating PSBTs #1894

Open bucko13 opened 1 year ago

bucko13 commented 1 year ago

This is a revisit of an issue raised a couple years ago in #1595 however it's now also causing an issue in signing for the new ledger bitcoin app with nested segwit inputs.

While it was noted in the previous ticket that the witnessUtxo can be derived from the nonWitnessUtxo and therefore both are technically unnecessary, this error Error: Can not add duplicate data to input seems to be a violation of the simple signer algorithm from BIP174. Unfortunately, the pseudocode in the BIP itself might be wrong based on this conversation in an issue in Electrum on the same subject.

The conclusion of that issue based on feedback from achow101 is that witnessUtxo must be present for segwitv0 or above and that the presence or lack thereof of nonWitnessUtxo data determines which signing algorithm to use. For a nested segwit transaction, since bitcoinjs-lib doesn't allow both, if you send a PSBT with nonWitnessUtxo only, then the non-segwit signature algorithm is chosen which produces incorrect signatures for nested segwit.

junderw commented 1 year ago

This change would require modifying the bip174 repo, bumping its version here, and making sure the usage in this repo is sound based on the new drastic changes.

I will definitely review any PRs to both repos, but to be honest I don't have much time to spend fixing an issue that is caused by ambiguities in the BIP and how each vendor interprets it. If you or someone else would like to do it, I'll consider it. The code changes will probably be small, but the implications (ie. amount of code in other places that assume only one or the other can exist) are probably much bigger.


Also as a side note, sample code in a BIP is not where you specify protocols. If that were true, every language would be beholden to the super minute details of how python's built-in str() function etc. works.

(ie. when making the electrum-mnemonic repo, I had to re-create Python's implementation of unicode normalization and CJK detection in raw JS... since it was slightly differing from Python. Which I only did because electrum's mnemonic scheme is only a Python implementation with no spec. I was literally just trying to implement a scheme that only exists as a Python implementation)

bucko13 commented 1 year ago

Also as a side note, sample code in a BIP is not where you specify protocols. If that were true, every language would be beholden to the super minute details of how python's built-in str() function etc. works.

Yeah, I don't think it's meant as a way to say "implement this", nor is it sample code, but rather its pseudo code of the signing algorithm that lays out the logic for deciding if/how something is to be signed, which the BIP seems like a good place for. I.e. "if data x is present, sign with algo x'. else if data y is present sign with algo y'".

FWIW, this isn't about how the vendors are interpreting it but as you can see in the link I sent is actually also how the BIP's author defined it when asked about this specific issue. That said, the test vectors do support the case where just witnessUtxo is present. The rest does seem to be an area of unfortunate ambiguity.

making sure the usage in this repo is sound based on the new drastic changes.

Do you have a sense of where changes of this nature might end up having an adverse impact? Based on the error bitcoins currently throws it sounds like it's just an issue of having unnecessary extra data serialized. Any software signers that might make decisions on how to sign based on the presence or lack thereof of one or the other fields should be fine if following the spec. As you noted in your comment on the other issue linked, the data for witnessUtxo is trivially calculated from the nonWitnessUtxo so it seems like there shouldn't be any issues. Since the BIP does say that both can be present it seems like something that should at least be allowed.

junderw commented 1 year ago

ie. We have logic that will check the scripts that they match any redeemScript / witnessScript hashes. If we allow both, we will need to handle the case where the witnessUtxo and nonWitnessUtxo are different.

bip174 repo shouldn't catch that, but the Psbt class in this repo should.

etc etc.