rust-bitcoin / rust-miniscript

Support for Miniscript and Output Descriptors for rust-bitcoin
Creative Commons Zero v1.0 Universal
343 stars 135 forks source link

Descriptor checksums #195

Closed darosior closed 3 years ago

darosior commented 3 years ago

I currently have a workaround for importing our descriptors into bitcoind (use getdescriptorinfo before importdescriptors), but was looking to implement actual descriptor support directly here. I think it also makes sense outside of bitcoind-land. What do you think ?

RCasatta commented 3 years ago

I am doing the same workaround (which basically make the checksum useless?), would like the checksum in the lib

sanket1729 commented 3 years ago

As a final goal, we plan to support all descriptors (except combo) that core supports in the specification that core supports. So, we plan to support that eventually.

cc @apoelstra

shesek commented 3 years ago

bdk has a rust implementation of descriptor checksums: https://github.com/bitcoindevkit/bdk/blob/master/src/descriptor/checksum.rs

darosior commented 3 years ago

Ok, i'm going to propose a PR.

I think it makes sense to output checksum with fmt::Display. And have a special method to display the descriptor without the checksum. (or the vice versa where fmt::Display does not output checksum)

Agreed

For parsing, we should check the checksum. I have not looked at the algorithm yet. But it may worth examining how it interacts with miniscript descriptors. i.e what guarantees are still maintained for a large script size. I don't think anything will change, but we should still review the error correction/detection guarantees.

For reference we discussed this on IRC and the detection guarantees are pretty high (https://github.com/bitcoin/bitcoin/blob/master/src/script/descriptor.cpp#L29L47).

bdk has a rust implementation of descriptor checksums: https://github.com/bitcoindevkit/bdk/blob/master/src/descriptor/checksum.rs

Thanks! Will probably steal Alekos' implementation then :D

darosior commented 3 years ago

Arf just read on IRC:

unfortunately the bdk code is not licensed compatibly with rust-miniscript

afilini commented 3 years ago

I'm not really an expert on licenses and stuff, but it would totally be okay for me if you used the implementation we have in bdk. That is itself basically a "translation" of the cpp code in Bitcoin Core.

Is this comment enough to "allow" you to copy it? Should I upstream it myself in a PR?

Also, while I agree that we should support that checksum format to maintain compatibility with Core, I was also thinking that using the string representation of a descriptor could lead to some issues: especially with extended keys, it's definitely possible to represent the same descriptor (meaning that it generates the same addresses) in a few different ways, for instance by adding/removing the key origin in one of the keys. This leads to different checksums for practically the same descriptor.

Do you think it would make sense to have a more "context-aware" checksum standard? Something that can kinda understand what's going on in a descriptor and compare the "meaning" of a descriptor, not just the text encoding.

apoelstra commented 3 years ago

Thanks @afilini! I think if @darosior adds a link to your comment here to his commit message, it'd be fine to relicense stuff.

I'm not a lawyer either, but I think the risk of legal action here is pretty much zero :). But it's good to be sure that we're all on the same page as far as permission to use code.

shesek commented 3 years ago

I was also thinking that using the string representation of a descriptor could lead to some issue .. different checksums for practically the same descriptor.

I stumbled into this question myself in bwt. The key origin example you gave isn't that problematic, because you're at least guaranteed that using the exact same descriptor will result in the same checksum.

But there are some cases where the descriptor won't get encoded through the Display implementation the same way it came in. One example I noticed is h vs ', but there are probably others. The moment these descriptors get deserialized into Descriptor objects, it is no longer possible to recover the original descriptor and its checksum.

One potential way to handle this is to keep the checksum of the original descriptor string around, next to the deserialized Descriptor. But for now, I instead opted to reject checksummed descriptors that aren't encoded in a canonical way, so it at least it fails loudly instead of resulting in odd unexpected behaviour (the user seeing different checksums than these he put in).

(for this to be entirely consistent it should probably reject all non-canonical descriptors, even if they weren't provided with an explicit checksum. but it seemed like this might be a bit too harsh.)