BlockchainCommons / Research

Blockchain Commons Research papers

Consider a compact PSBT based encoding for wallet descriptors #135

Closed seedhammer closed 10 months ago

seedhammer commented 11 months ago

Prompted by discussions at https://github.com/wizardsardine/liana/issues/539, this is a sketch for a BIP proposal for serializing wallet descriptors. I raise it here for comments and because Blockchain Commons is a recognized standards body for Bitcoin specifications. If there's enough interest, it should be fleshed out and proposed as a BIP.

A Go implementation written from scratch is here: https://github.com/seedhammer/bip-serialized-descriptors

At the high level, this is a binary and compact serialization specification for the [wallet-policies] BIP.

Borrowing from BIP174, the format would be

 <desc> := <magic> <global-map> <key-map>*
 <magic> := 0x64 0x65 0x73 0x63 0xFF
 <global-map> := <keypair>* 0x00
 <key-map> := <keypair>* 0x00
 <keypair> := <key> <value>
 <key> := <keylen> <keytype> <keydata>
 <value> := <valuelen> <valuedata>

Where <global-map> contains one or more of the fields:

In future, other script formats (Simplicity?) can be added as separate field types.

The <key-map> is a list of keys, matching the indexed references from the descriptor. Each <key-map> contains one or more of the fields:

FAQ

Why not use the Blockchain Commons [BCR-2020-010] format?

The format was recently deprecated, for good reasons: its binary format is compact but difficult to extend to support extensions to the descriptor format, such as Miniscript.

Why not use the Blockchain Commons Envelope format for descriptors?

The standard is being developed and as such does not have an advantage of existing use.

Why not use CBOR for encoding?

What about QR code representation?

A straightforward encoding would be to use the Blockchain Commons [BCR-2020-005] standard for splitting the serialized descriptor into multiple QR code frames. I believe the general-purpose bytes urtype is sufficient, because the magic header decreases the likelihood of misinterpretation.

Doesn't UR support rely on a CBOR implementation anyway?

It's true that the UR specification relies on CBOR for encoding data shards, but it's my understanding that the required subset of CBOR can practically be implemented ad hoc, without a full-fledged CBOR library.

Some devices, such as the camera-less Coldcards, won't implement the UR encoding because they exchange serialized descriptors through higher bandwidth mediums such as SD cards, USB, or NFC.

Why not encode the descriptor itself in binary?

The BC envelope format made the same decision of encoding the descriptor in text, for good reasons:

[wallet-policies] https://github.com/bitcoin/bips/blob/bb98f8017a883262e03127ab718514abf4a5e5f9/bip-wallet-policies.mediawiki

[BIP174] https://github.com/bitcoin/bips/blob/master/bip-0174.mediawiki

[BCR-2020-010] https://github.com/BlockchainCommons/Research/blob/master/papers/bcr-2020-010-output-desc.md

[BCR-2020-005] https://github.com/BlockchainCommons/Research/blob/master/papers/bcr-2020-005-ur.md

seedhammer commented 11 months ago

For everyone interested in joining the discussion, we'll be at the next Gordian Developer community meeting on Nov. 1 to present this proposal.

pythcoiner commented 11 months ago

cc @bigspider

wolfmcnally commented 11 months ago

@seedhammer Quick note: we deprecate the use of the bytes UR type for any purpose other than testing and demonstration purposes. So we are unlikely to support any proposal that specifies it.

wolfmcnally commented 11 months ago

@seedhammer @ChristopherA See below for my initial response to some of the points raised above.

Why not use the Blockchain Commons Envelope format for descriptors?

Why not use CBOR for encoding?

I think a major sticking point here is in thinking that representing output descriptors in dCBOR/Envelope would require highly general purpose parser/codecs. This is not true. Minimal specifications for output descriptors represented as Gordian Envelope will yield minimal, deterministic binary serializations that could be documented without any reference to CBOR or Envelope, and would be highly suitable for embedded environments. Unlike bespoke formats however, using Envelope provides a forward-looking platform that makes it easy to add support for future enhancements.

wolfmcnally commented 11 months ago

@seedhammer @ChristopherA I'd also like to add that our Rust implementation of Gordian Envelope provides a lot of "feature gates" for optional extensions like encryption, which makes it more suitable for use in resource-constrained environments. Users can turn off all the feature gates they don't use, and the Rust compiler will aggressively strip all unneeded code and dependencies.

wolfmcnally commented 11 months ago

@seedhammer @ChristopherA To demonstrate what I've been talking about, let's take an example of a minimal encoding for an output descriptor.

In Envelope notation, the subject is just the text of the output descriptor, and it has one assertion declaring its type:

"wpkh([55016b2f/84'/1'/2']xpub6BkiBzPzLUEo9F5n6N4CSKWzFeXdWaKGhYsVNXH8bqfbeAhdpvNeGhu2mP35cABAwDHNpHD5hmXfZcMSdpTUmAyCYnQggXkk9hwbTP9KRRB/<0;1>/*)#cf9l9nxt" [
    'isA': 'OutputDescriptor'
]

Now let's look at the envelope as a tagged CBOR structure:

d8 c8                                    # tag(200) envelope
   82                                    # array(2)
      d8 18                              # tag(24) leaf
         78 9a                           # text(154)
            77706b68... # "wpkh([55016b2f/84'/1'/2']xpub6BkiBzPzLUEo9F5n6N4CSKWzFeXdWaKGhYsVNXH8bqfbeAhdpvNeGhu2mP35cABAwDHNpHD5hmXfZcMSdpTUmAyCYnQggXkk9hwbTP9KRRB/<0;1>/*)#cf9l9nxt"
      a1                                 # map(1)
         01                              # unsigned(1) 'isA'
         1901fb                          # unsigned(507) 'outputDescriptor'

The first two bytes are CBOR tag 200, indicating this is a Gordian Envelope. This tag is now standardized, as it has been registered with IANA. This value is a constant, and can be expected by a minimal parser.

The next byte is 82 indicating a CBOR array of two elements. In this minimal encoding this value will be a constant, and can be expected by a minimal parser.

The next two bytes introduce the first element of the CBOR array, which is a CBOR tagged element tagged with (24). In this minimal encoding this value will be a constant, and can be expected by a minimal parser.

The next two bytes introduce a CBOR text string, and specify its length. The bytes that follow it are the textual representation of the output descriptor. No null terminator is needed.

The remaining bytes are the second element of the array, which is a CBOR map with one entry. The key is the integer 1 (byte 01) representing the known value isA (a type declaration), and the value is the integer 507 representing the known value OutputDescriptor. This type declaration adds five bytes to the total length of the envelope; it is fixed and can be expected by a minimal parser.

So to a minimal parser, the "magic bytes" are at the beginning (envelope) and the end (output descriptor), and everything else amounts to simple-to-parse, often invariant values.
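The byte-by-byte walk-through above can be reproduced with a few lines of code. The following is a minimal Python sketch of CBOR's head encoding (major type plus argument, as defined in RFC 8949), not an excerpt from any Envelope library; the function name `cbor_head` and the constants are made up for illustration.

```python
# Major types used in the dump above (RFC 8949).
UNSIGNED, TEXT, ARRAY, MAP, TAG = 0, 3, 4, 5, 6

def cbor_head(major: int, value: int) -> bytes:
    """Encode a CBOR head (major type + argument) in its shortest form."""
    if not 0 <= major <= 7 or value < 0:
        raise ValueError("invalid major type or argument")
    if value < 24:
        # Small arguments are packed into the initial byte itself.
        return bytes([(major << 5) | value])
    # Otherwise the low 5 bits select a 1-, 2-, 4- or 8-byte big-endian argument.
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < 1 << (8 * size):
            return bytes([(major << 5) | info]) + value.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")

heads = [
    ("tag(200)", cbor_head(TAG, 200)),            # d8c8
    ("array(2)", cbor_head(ARRAY, 2)),            # 82
    ("tag(24)", cbor_head(TAG, 24)),              # d818
    ("text(154)", cbor_head(TEXT, 154)),          # 789a
    ("map(1)", cbor_head(MAP, 1)),                # a1
    ("unsigned(1)", cbor_head(UNSIGNED, 1)),      # 01
    ("unsigned(507)", cbor_head(UNSIGNED, 507)),  # 1901fb
]
for name, raw in heads:
    print(f"{name:14} {raw.hex()}")
```

Each value matches the corresponding bytes in the annotated dump, which is why a minimal parser can treat all of them except the text length as constants.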

Now to be fair: if support for hasName or note assertions is required, then the order of these assertions (including the isA) needs to be determined by computing a SHA-256 hash for each assertion and then sorting them. But the code to perform SHA-256 is small, and sorting an array of hashes is also quite easy in constrained environments.
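To give a feel for how small the sorting step is, here is a rough Python sketch. It is a simplification under stated assumptions: it hashes placeholder byte strings directly, whereas the real digest of an Envelope assertion is defined by the Envelope specification and is not reproduced here. The function name `sort_assertions` and the sample values are hypothetical.

```python
import hashlib

def sort_assertions(encoded_assertions):
    """Return the assertions ordered by the SHA-256 digest of their bytes."""
    return sorted(encoded_assertions, key=lambda raw: hashlib.sha256(raw).digest())

# Hypothetical, already-serialized assertions (placeholders, not real Envelope encodings).
assertions = [
    b"isA:OutputDescriptor",
    b"hasName:Satoshi's Stash",
    b"note:cold storage",
]

for raw in sort_assertions(assertions):
    # Print a digest prefix next to each assertion to show the resulting order.
    print(hashlib.sha256(raw).hexdigest()[:16], raw.decode())
```

The point is only that the ordering depends on nothing but SHA-256 and a byte-wise comparison, both of which are available on constrained devices.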

wolfmcnally commented 11 months ago

So if output descriptors are so simple, why use Envelope to represent them?

Two reasons:

  1. Adding hasName or note fields is just the tip of the iceberg of the kinds of metadata that can easily be added using Gordian Envelope.
  2. CBOR-encoded objects (including every envelope) play well with UR encoding, and anything that can be UR-encoded can be transmitted across air gaps by QR codes using fountain codes.

ChristopherA commented 11 months ago

I’d like to see the keys externalized (remove xpubs and point to binary keys). I find it useful to be able to attach notes to the different keys in a multisig, for instance a nostr or signal address to contact the keyholder.

wolfmcnally commented 11 months ago

That's all easily doable using Envelope, but the more complex and variable the structure becomes, the more likely you'll want a full dCBOR/Envelope implementation, even if you need nothing more than the base Envelope specification with all the optional feature gates turned off.

seedhammer commented 11 months ago

I've implemented a strawman Go codec to keep the conversation concrete: https://github.com/seedhammer/bip-serialized-descriptors

Quick note: we deprecate the use of the bytes UR type for any purpose other than testing and demonstration purposes. So we are unlikely to support any proposal that specifies it.

Understood. Note that the particular UR representation is outside the scope of this specification. I mentioned bytes because the format is self-identifying and as such doesn't require a UR type.

I'm therefore leaning towards a generic UR type (binary, file, etc.), for the same reason that your envelopes all share the same UR type. I don't oppose a crypto-bipxxx or similar, though.

CBOR is more compact than text, but descriptors are encoded in text and do not enjoy any of the advantages of CBOR.

This appears to also apply to the present proposal.

Almost: this proposal binary encodes xpubs, as referred to in https://github.com/BlockchainCommons/Research/issues/135#issuecomment-1778306970.

I think a major sticking point here is in thinking that representing output descriptors in dCBOR/Envelope would require highly general purpose parser/codecs. This is not true. Minimal specifications for output descriptors represented as Gordian Envelope will yield minimal, deterministic binary serializations that could be documented without any reference to CBOR or Envelope, and would be highly suitable for embedded environments.

It is not clear to me how the enveloped descriptor format can be completely specified without referring to either dCBOR or Envelope, never mind the requirement that fields be sorted according to SHA-256. Can you describe the format? You did offer a thorough example of one instance of an enveloped output descriptor, but that's not the same as specifying the format for every descriptor. The former is sufficient for encoders, but decoders need to cover every possible instance. One example where this matters: your specified minimal encoding for a two-element array (82 in dCBOR) doesn't say anything about the minimal encoding for other array sizes.

To reciprocate your efforts, I implemented most of this proposal: https://github.com/seedhammer/bip-serialized-descriptors. Note in particular how few lines the codec takes, assuming you already have access to a PSBT codec: around 300 lines in total, including comments, with no external dependencies.

Unlike bespoke formats however, using Envelope provides a forward-looking platform that makes it easy to add support for future enhancements.

This proposal is an existence proof that BIP-174 is extensible. Perhaps not as much as Envelopes, but enough to cover at least two important use cases (PSBTs and now descriptors), and I see no reason BIP-174 cannot be used for most or all future binary encodings in BIPs.

A specification (or BIP) should ideally be complete and self-contained, and referencing another standard (CBOR) makes implementing, debugging and ensuring interoperability harder. For an example of this complexity, CBOR is not deterministic (I believe the Blockchain Commons is working on a dCBOR proposal to fix that).

It is rare, if not impossible, for a proposed standard to not reference other standards.

It may be rare, but that's beside the point. All things equal, wouldn't you agree that a self-contained specification is superior to a specification with external references?

PSBT is not a standard for anything but itself, and is not put forth as a platform with a future upon which further standards should be based.

This is another major sticking point. I claim that BIP-174 (PSBT) is very much a standard on at least equal footing to what other standards bodies produce.

I further claim that it is more important for widespread use to produce a BIP than to submit proposals to other standards bodies. Regardless of format, a proposal should not be declared complete until significant buy-in from the community has been attained (gauged through bitcoin mailing lists, wallet developers).

Finally, I claim that it's more important to have a simple specification than to cover use cases outside of Bitcoin.

Of course, the above claims assume the Bitcoin perspective.

jeandudey commented 11 months ago

Hey!

I have been working with the PSBT format and I can't recommend it for other serialization purposes. It was designed with normal processors in mind, where memory is freely available, and it doesn't really suit embedded use cases because the format is complex to parse and to keep in memory.

Why not use the Blockchain Commons [BCR-2020-010] format?

The format was recently deprecated, for good reasons: its binary format is compact but difficult to extend to support extensions to the descriptor format, such as Miniscript.

I think this point is not valid, CBOR was designed to be extensible and CDDL too, in fact, one can extend choices in a CBOR type by using the Socket/Plug mechanism which allows each one to extend choices, crypto-output can be adapted to that without issues IMO.

See:

It relies on the CBOR encoding which is a significant complexity in a specification (more on CBOR later).

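The CBOR format was made specifically to be small and for embedded devices, there are several implementations of CBOR out there that are very small, for example:

* [NanoCBOR](https://github.com/bergzand/NanoCBOR), it has like ~1300 lines of code for both the encoder and decoder, written in C and it is a zero copy parser, so there's no memory allocation overhead.

* [TinyCBOR](https://github.com/intel/tinycbor) which is more complex than NanoCBOR yet small for microcontroller standards, also it's zero-copy and doesn't allocate.

For real world usage of NanoCBOR there is:

* https://github.com/RIOT-OS/RIOT/tree/2b9e82851b1bcf4f977cbca5bb604afde42a1385/sys/suit

Which decodes the standard mentioned before for MCU firmware upgrades.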

CBOR is more compact than text, but descriptors are encoded in text and do not enjoy any of the advantages of CBOR.

I agree with this but using the CBOR representation avoids parsing the text which is more difficult to do and does not make it easy to do zero-copy parsing since one has to store decoded bytes into dynamically sized containers.

For instance, this data structure would be impossible to do if parsing from text as one has to additionally keep a separate location in memory to where to copy decoded stuff into, instead of a single one for the descriptor AST. That one reuses the buffer of the CBOR-encoded crypto-output, so basically zero-copy.

Even then I think that with CBOR it is complicated, but without it is even harder.

The PSBT format is already a BIP and widely accepted. A self-contained specification based on an already widely accepted binary format increases our chances of widespread support. In particular, I'd like to see Bitcoin Core support this format.

Yeah but PSBT is very application specific and I doubt that code that parses PSBT could be easily adapted to parse a format based on it since that code assumes PSBT is only for that purpose, it would require refactorings anyway in most code bases that handle the PSBT format.

As a side note, CBOR is also self-contained and self-describing so any application that does not speak the UR standards can decode it, this is a plus for languages like JavaScript or Python while also being small and compatible with embedded devices.

A specification (or BIP) should ideally be complete and self-contained, and referencing another standard (CBOR) makes implementing, debugging and ensuring interoperability harder. For an example of this complexity, CBOR is not deterministic (I believe the Blockchain Commons is working on a dCBOR proposal to fix that).

Nor is the PSBT format, in fact it is hard to serialize the same PSBT just after deserializing it without any modifications, there can exist N versions of a PSBT that represent the very same PSBT, and that's fine IMO since the format doesn't need it, same applies to CBOR.

Unless CBOR is used in a distributed system where all peers need to agree on how some stuff is serialized into CBOR, but that doesn't apply to descriptors.

As a side note, CBOR can be deterministic if you define your rules and the standard even recommends so:

https://www.rfc-editor.org/rfc/rfc8949.html#section-4.2


P.S.: I think defining custom serialization formats for trivial data structures does not help the adoption of Bitcoin and I hope that trend declines.

seedhammer commented 11 months ago

Why not use the Blockchain Commons [BCR-2020-010] format? The format was recently deprecated, for good reasons: its binary format is compact but difficult to extend to support extensions to the descriptor format, such as Miniscript.

I think this point is not valid, CBOR was designed to be extensible and CDDL too, in fact, one can extend choices in a CBOR type by using the Socket/Plug mechanism which allows each one to extend choices, crypto-output can be adapted to that without issues IMO.

Maybe, but is it relevant? This proposal is an alternative to the Envelope output descriptor format that encodes the descriptor itself in text, not in CBOR. Extending the BCR-2020-010 format is another discussion.

It relies on the CBOR encoding which is a significant complexity in a specification (more on CBOR later).

The CBOR format was made specifically to be small and for embedded devices, there are several implementations of CBOR out there that are very small, for example:

* [NanoCBOR](https://github.com/bergzand/NanoCBOR), it has like ~1300 lines of code for both the encoder and decoder, written in C and it is a zero copy parser, so there's no memory allocation overhead.

* [TinyCBOR](https://github.com/intel/tinycbor) which is more complex than NanoCBOR yet small for microcontroller standards, also it's zero-copy and doesn't allocate.

For real world usage of NanoCBOR there is:

* https://github.com/RIOT-OS/RIOT/tree/2b9e82851b1bcf4f977cbca5bb604afde42a1385/sys/suit

Which decodes the standard mentioned before for MCU firmware upgrades.

Sure, but this proposal is even simpler. A codec is 300 lines from scratch, even less if your software already includes a PSBT parser.

In other words, this proposal is clearly an afternoon dependency, whereas I have yet to see a specification for bcr-2023-007-envelope-output-desc that doesn't require a third party library, however small or memory light. I'd love to be proved wrong by someone specifying or implementing a parser for BCR-2023-007 from scratch in a reasonable amount of code.

This matters not only for development time and for resource constrained environments, but also for security. 300 lines are easier to review than a CBOR implementation. Don't forget that we're parsing potentially adversarial data!

CBOR is more compact than text, but descriptors are encoded in text and do not enjoy any of the advantages of CBOR.

I agree with this but using the CBOR representation avoids parsing the text which is more difficult to do and does not make it easy to do zero-copy parsing since one has to store decoded bytes into dynamically sized containers.

For instance, this data structure would be impossible to do if parsing from text as one has to additionally keep a separate location in memory to where to copy decoded stuff into, instead of a single one for the descriptor AST. That one reuses the buffer of the CBOR-encoded crypto-output, so basically zero-copy.

I believe this point is irrelevant because BCR-2023-007 is textual, see above.

The PSBT format is already a BIP and widely accepted. A self-contained specification based on an already widely accepted binary format increases our chances of widespread support. In particular, I'd like to see Bitcoin Core support this format.

Yeah but PSBT is very application specific and I doubt that code that parses PSBT could be easily adapted to parse a format based on it since that code assumes PSBT is only for that purpose, it would require refactorings anyway in most code bases that handle the PSBT format.

The strawman implementation is an existence proof that the PSBT format is sufficiently general to cover output descriptors.

A specification (or BIP) should ideally be complete and self-contained, and referencing another standard (CBOR) makes implementing, debugging and ensuring interoperability harder. For an example of this complexity, CBOR is not deterministic (I believe the Blockchain Commons is working on a dCBOR proposal to fix that).

Nor is the PSBT format, in fact it is hard to serialize the same PSBT just after deserializing it without any modifications, there can exist N versions of a PSBT that represent the very same PSBT, and that's fine IMO since the format doesn't need it, same applies to CBOR.

I concede your point about determinism. I stand by my point that a self-contained specification is superior to one with references, all things being equal.

jeandudey commented 11 months ago

In other words, this proposal is clearly an afternoon dependency...

Software is made of dependencies of dependencies, so I think this doesn't apply to today's world. Creating another serialization format just means verifying another implementation of something instead of using well-proven solutions.

If dependencies are an issue for you, one could use GNU Guix, which largely solves that issue; in fact, Bitcoin Core uses it for reproducible builds.

One would just have to review the particular CBOR implementation.

Fun fact, I doubt any true Bitcoiner has reviewed every one of the dependencies, here's how you can build a graph:

guix shell bash graphviz -- bash -c 'guix graph bitcoin-core --type=bag-emerged | dot -Tpdf -o dependencies.pdf'

I'm running it and it'll take a few hours to produce the graph from the very first dependencies, all of the compiler bootstrap chains, all of the GNU utilities, Qt, etc. up to bitcoin-core. I don't think it'll finish generating the graph today or tomorrow.

We rely on other people's code and that's fine.

The strawman implementation is an existence proof that the PSBT format is sufficiently general to cover output descriptors.

The code assumes the existence of an operating system with a malloc implementation, like any modern OS, so it hardly applies to embedded code. I think it's best to show an example of reading the file format using C without a libc, or Rust without std/alloc, reading the bytes from memory directly.

These are the places where one might see the true limitations of a format like PSBT.

I believe this point is irrelevant because BCR-2023-007 is textual, see above.

I think a crypto-output-like approach is still valid though.

seedhammer commented 11 months ago

I believe this point is irrelevant because BCR-2023-007 is textual, see above.

I think a crypto-output-like approach is still valid though.

I don't think it's worth the decrease in size to have to invent binary representations for every (future) output descriptor feature. This is the reason I think Enveloped Descriptors chose wisely by encoding the descriptor in text.

jeandudey commented 11 months ago

This is the reason I think Enveloped Descriptors chose wisely by encoding the descriptor in text.

It still has to be parsed though, it's the same for both crypto-output and a text representation, either way an application that needs to understand the descriptor needs to parse it into an AST.

wolfmcnally commented 11 months ago

@seedhammer @ChristopherA

I've implemented a strawman Go codec to keep the conversation concrete: https://github.com/seedhammer/bip-serialized-descriptors

As I've shown, a minimal Gordian Envelope containing a string-based output descriptor has only 12 bytes of overhead. Assuming you weren't concerned about parsing the string itself, I'm sure I can produce an encoder and decoder for it in well less than 300 lines of code total (well maybe not in Rust, which is rather verbose). The only variable length element is the length of the string, and a minimal CBOR varint codec suitable for this application is very small. Everything else is in deterministic positions.

If you can accept for argument's sake that everything I'm saying is true, then why wouldn't we use it? What's the point in having yet another serialization format that offers no standardization and no future expandability?

And yes, I agree with @jeandudey that most dependencies have dependencies of their own, and that is normal and accepted. Any dependency tree is going to have way more internal nodes than leaves: you could be writing in pure C and you'll still want strlen and malloc.

So where do you draw the line about having dependencies and why? I understand that constrained systems need code with fewer dependencies, but I already mentioned that our Rust envelope implementation is feature gated and Rust itself aggressively strips dead code.

Oh, and I'd also like to point out that in CBOR you can tag any CBOR structure. And as long as you define an equivalence between a CBOR numeric tag you choose and a UR type string, you can simply enclose an envelope in your CBOR tag and poof! you have:

ur:my-type/xxxxxx.....

So envelopes don't all have to be ur:envelope. It's up to the developer to decide whether and how to use outer tags or let the envelope contents be self-describing.

wolfmcnally commented 11 months ago

@seedhammer @ChristopherA

You asked how such a structure would be speced without reference to CBOR or Envelope. Let's take a look at the serialized structure again:

d8 c8                                    # tag(200) envelope
   82                                    # array(2)
      d8 18                              # tag(24) leaf
         78 9a                           # text(154)
            77706b68... # "wpkh([55016b2f/84'/1'/2']xpub6BkiBzPzLUEo9F5n6N4CSKWzFeXdWaKGhYsVNXH8bqfbeAhdpvNeGhu2mP35cABAwDHNpHD5hmXfZcMSdpTUmAyCYnQggXkk9hwbTP9KRRB/<0;1>/*)#cf9l9nxt"
      a1                                 # map(1)
         01                              # unsigned(1) 'isA'
         1901fb                          # unsigned(507) 'outputDescriptor'

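Here is the complete spec:

* The header bytes MUST be `d8 c8 82 d8 18`,

* If the length of the output descriptor string is:

  * < 24 bytes, the next byte is `60` + length
  * < 256 bytes, the next bytes are `78 NN` where NN is a 1 byte length
  * < 65536 bytes, the next bytes are `79 HH LL` where HH LL are a 2-byte length (big endian),

* The encoded length is followed by the output descriptor string with no terminator,

* The suffix bytes MUST be `a1 01 19 01 fb`.

That's it! No mention of Envelope or CBOR at all.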

Any code that implements the above spec can read a Gordian Envelope with a minimal Output Descriptor structure (not including name, note, or other metadata). On the constrained platform, due to the deterministic nature of Gordian Envelope there is NO possible variance except for the length of the string.

On the other end, any full-featured Gordian Envelope codec can read the output of a minimal encoder that outputs the above structure.

seedhammer commented 11 months ago

As I've shown, a minimal Gordian Envelope containing a string-based output descriptor has only 12 bytes of overhead. Assuming you weren't concerned about parsing the string itself, I'm sure I can produce an encoder and decoder for it in well less than 300 lines of code total (well maybe not in Rust, which is rather verbose). The only variable length element is the length of the string, and a minimal CBOR varint codec suitable for this application is very small. Everything else is in deterministic positions.

I certainly believe a minimal decoder for enveloped descriptors is feasible from scratch. However, is it correctly understood that such a decoder will not be able to parse envelopes with metadata, such as name or compact xpubs, even if the decoder doesn't care about them? See also below.

If you can accept for argument's sake that everything I'm saying is true, then why wouldn't we use it? What's the point in having yet another serialization format that offers no standardization and no future expandability?

I don't think "no standardization and no future expandability" is a fair characterization of this proposal. It is an existence proof of expandability by being an expansion of PSBT. It is a standard (from the Bitcoin perspective) because PSBT is a BIP.

You asked how such a structure would be speced without reference to CBOR or Envelope. Let's take a look at the serialized structure again:

d8 c8                                    # tag(200) envelope
   82                                    # array(2)
      d8 18                              # tag(24) leaf
         78 9a                           # text(154)
            77706b68... # "wpkh([55016b2f/84'/1'/2']xpub6BkiBzPzLUEo9F5n6N4CSKWzFeXdWaKGhYsVNXH8bqfbeAhdpvNeGhu2mP35cABAwDHNpHD5hmXfZcMSdpTUmAyCYnQggXkk9hwbTP9KRRB/<0;1>/*)#cf9l9nxt"
      a1                                 # map(1)
         01                              # unsigned(1) 'isA'
         1901fb                          # unsigned(507) 'outputDescriptor'

Here is the complete spec:

* The header bytes MUST be `d8 c8 82 d8 18`,

* If the length of the output descriptor string is:

  * < 24 bytes, the next byte is `60` + length
  * < 256 bytes, the next bytes are `78 NN` where NN is a 1 byte length
  * < 65536 bytes, the next bytes are `79 HH LL` where HH LL are a 2-byte length (big endian),

* The encoded length is followed by the output descriptor string with no terminator,

* The suffix bytes MUST be `a1 01 19 01 fb`.

That's it! No mention of Envelope or CBOR at all.

Any code that implements the above spec can read a Gordian Envelope with a minimal Output Descriptor structure (not including name, note, or other metadata).

The leaving out of name, note or other metadata is key. Will a minimal decoder be able to successfully extract the descriptor from every envelope, including those with metadata?

On the other end, any full-featured Gordian Envelope codec can read the output of a minimal encoder that outputs the above structure.

Absolutely. Encoding is much easier to do from scratch, because an encoder need only implement what it needs.
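To make the quoted rules concrete, here is a from-scratch Python sketch of an encoder and a strict decoder for exactly that minimal structure. It is an illustration, not a reference implementation: the names `encode` and `decode` are made up, and rejecting non-shortest length encodings is my reading of the dCBOR determinism requirement rather than something the quoted rules state. As discussed, it cannot read envelopes that carry a name, note or other metadata.

```python
HEADER = bytes.fromhex("d8c882d818")  # tag(200), array(2), tag(24)
SUFFIX = bytes.fromhex("a1011901fb")  # map(1) { 1 ('isA'): 507 }

def encode(descriptor: str) -> bytes:
    text = descriptor.encode("utf-8")
    n = len(text)
    if n < 24:
        length = bytes([0x60 + n])
    elif n < 256:
        length = bytes([0x78, n])
    elif n < 65536:
        length = b"\x79" + n.to_bytes(2, "big")
    else:
        raise ValueError("descriptor too long for this sketch")
    return HEADER + length + text + SUFFIX

def decode(data: bytes) -> str:
    # Shortest valid input: header, one length byte, empty string, suffix.
    if len(data) < len(HEADER) + 1 + len(SUFFIX) or not data.startswith(HEADER):
        raise ValueError("bad header or truncated input")
    pos = len(HEADER)
    first = data[pos]
    if 0x60 <= first < 0x78:
        n, pos = first - 0x60, pos + 1
    elif first == 0x78:
        n, pos = data[pos + 1], pos + 2
        if n < 24:
            raise ValueError("length is not in shortest form")
    elif first == 0x79:
        n, pos = int.from_bytes(data[pos + 1 : pos + 3], "big"), pos + 3
        if n < 256:
            raise ValueError("length is not in shortest form")
    else:
        raise ValueError("unsupported length encoding")
    text, rest = data[pos : pos + n], data[pos + n :]
    if len(text) != n or rest != SUFFIX:
        raise ValueError("truncated input or unexpected trailing structure")
    return text.decode("utf-8")

DESCRIPTOR = (
    "wpkh([55016b2f/84'/1'/2']xpub6BkiBzPzLUEo9F5n6N4CSKWzFeXdWaKGhYsVNXH8bqfbe"
    "AhdpvNeGhu2mP35cABAwDHNpHD5hmXfZcMSdpTUmAyCYnQggXkk9hwbTP9KRRB/<0;1>/*)#cf9l9nxt"
)
blob = encode(DESCRIPTOR)
print(len(blob) - len(DESCRIPTOR))  # 12 for any descriptor of 24 to 255 bytes
print(decode(blob) == DESCRIPTOR)
```

Any envelope with additional assertions fails the suffix check, so this decoder rejects such input instead of extracting the descriptor from it.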

seedhammer commented 11 months ago

Thank you for a great meeting today! I look forward to @wolfmcnally's simplified specification for a CBOR-based output descriptor format.

To make it easier to compare features and because @wolfmcnally asked for the binary representation of this proposal, I've restructured my demo implementation to be usable from the Go playground.

https://go.dev/play/p/nouZlbbcEWt is a copy of main.go from the implementation. It outputs the binary representation of a multisig descriptor with 3 keys, along with a textual dump for comparison:


Serialized descriptor (length 397):
64657363ff01010f5361746f736869277320537461736801003477736828736f727465646d756c746928322c40302f3c303b313e2f2a2c40312f3c303b313e2f2a2c40322f3c303b313e2f2a29290053000488b21e0418f8c2e7800000026b3a4cfb6a45f6305efe6e0e976b5d26ba27f7c344d7fc7abef7be2d06d52dfd021c0b479ecf6e67713ddf0c43b634592f51c037b6f951fb1dc6361a98b1e5735e0f8b515314dc567276480000800000008000000080020000800053000488b21e04221eb5a080000002c887c72d9d8ac29cddd5b2b060e8b0239039a149c784abe6079e24445db4aa8a0397fcf2274abd243d42d42d3c248608c6d1935efca46138afef43af08e971289657009d2b14f245ae38480000800000008000000080020000800053000488b21e041c0ae906800000025afed56d755c088320ec9bc6acd84d33737b580083759e0a0ff8f26e429e0b77028342f5f7773f6fab374e1c2d3ccdba26bc0933fc4f63828b662b4357e4cc3791bec0fbd814c5d872974800008000000080000000800200008000

Decoded descriptor:
Name: Satoshi's Stash
Descriptor: wsh(sortedmulti(2,@0/<0;1>/*,@1/<0;1>/*,@2/<0;1>/*))
xpub: [dc567276/48h/0h/0h/2h]xpub6DiYrfRwNnjeX4vHsWMajJVFKrbEEnu8gAW9vDuQzgTWEsEHE16sGWeXXUV1LBWQE1yCTmeprSNcqZ3W74hqVdgDbtYHUv3eM4W2TEUhpan
xpub: [f245ae38/48h/0h/0h/2h]xpub6DnT4E1fT8VxuAZW29avMjr5i99aYTHBp9d7fiLnpL5t4JEprQqPMbTw7k7rh5tZZ2F5g8PJpssqrZoebzBChaiJrmEvWwUTEMAbHsY39Ge
xpub: [c5d87297/48h/0h/0h/2h]xpub6DjrnfAyuonMaboEb3ZQZzhQ2ZEgaKV2r64BFmqymZqJqviLTe1JzMr2X2RfQF892RH7MyYUbcy77R7pPu1P71xoj8cDUMNhAMGYzKR4noZ

Note that the playground is live, so you can edit the example and run it to play with the format without having Go installed.

seedhammer commented 11 months ago

Here's a summary of the conclusions from today's Gordian meeting, for people who didn't attend and don't plan to watch the recording. @wolfmcnally et al, please correct any misunderstandings.

We agreed that

With respect to this proposal, @wolfmcnally is confident that a specification (and reference implementation) can be produced that is both CBOR compatible and fully specified such that an implementation can be written without depending on a (d)CBOR library. Quite possibly the format will no longer be in envelope form.

@wolfmcnally will work on such a proposal as Commons' time and priorities allow.

We all seemed to agree that such a specification, assuming it is produced and can be proposed as a BIP, would be the best of both worlds: easy to use given a CBOR library, but also simple enough to read and write from scratch.

kloaec commented 10 months ago

Just a quick comment here to say that while output descriptors aren't always part of the user backup stash, they should be. Even for single sig wallets.

The discussion mentions Miniscript but it seems to me that the examples are still quite restricted to multisig. Many projects are now using Miniscript in prod, usually not for simple multisig. And obviously as we get away from the simple multisig, the critical need for output descriptor backups is made even more obvious. A simple example for Liana: you can have multiple paths with different timelocks, of arbitrary values each. With any number of keys each. Miniscript wallets are: Liana, MyCitadel, Trident, Fedi, and probably many more.

Miniscript also still evolves, is a bit different when used as MiniTapscript, etc.

For us at Wizardsardine, it's super important to be able to offer a way to back up these descriptors long term. For airgapped wallet manufacturers it's about easily/efficiently importing the descriptor.

Users are already using wallets like Liana, and are currently backing up text files. We are really looking forward to seeing where this thread is going, and hope to see an implementation soon! Metal plate engraved or hammered Miniscript descriptors are definitely a missing part of our toolset.

ChristopherA commented 10 months ago

We also believe that backing up the descriptor is increasingly important, even for singlesig. This has been one of the requirements that drove the move from SSKR for seeds only to Gordian Envelopes for everything.

Right now only our reference app, Gordian Seed Tool on iOS/Mac in public beta TestFlight supports this, but we hope more wallets will soon.

seedhammer commented 10 months ago

Right now only our reference app, Gordian Seed Tool on iOS/Mac in public beta TestFlight supports this, but we hope more wallets will soon.

@ChristopherA does "we hope more wallets will soon" mean that BCR-2023-007 ("Envelope Bitcoin Output Descriptors") is final from the perspective of Blockchain Commons? If not, can you comment on the progress of the work mentioned in https://github.com/BlockchainCommons/Research/issues/135#issuecomment-1789644032?

wolfmcnally commented 10 months ago

@seedhammer I have been working on another task the past couple weeks, but I am now getting down to the new specification we've been discussing and will have more to report soon!

seedhammer commented 10 months ago

Prompted by https://github.com/coinkite/BBQr/issues/1#issuecomment-1824920970 I posted this proposal to the bitcoin-dev mailing list: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2023-November/022184.html.

The mailing list post asked whether to extend PSBT rather than invent a new format. A response supports that idea:

I think the goal of such a format should be that it is already a valid PSBT, or can be trivially converted into one. Ideally, a coordinating device can load the standardized descriptor file, add inputs (PSBTv2) or unsigned TX (PSBTv1), and send it to compatible signing devices without further modification.

seedhammer commented 10 months ago

To summarize today's meeting (inaccuracies are mine!):

From our perspective:

We still believe PSBT is the superior encoding for output descriptors for the following reasons:

On that basis, we intend to:

I'm closing this issue: BC has considered and rejected a PSBT based encoding of output descriptors.

seedhammer commented 9 months ago

The first draft is here: https://github.com/seedhammer/bips/blob/master/bip-psbt-descriptors.mediawiki. Also posted to the bitcoin-dev mailing list.