rust-bitcoin / rust-miniscript

Support for Miniscript and Output Descriptors for rust-bitcoin
Creative Commons Zero v1.0 Universal
351 stars 138 forks source link

explore 'recursion' crate, or at least copy its algorithms #484

Open apoelstra opened 2 years ago

apoelstra commented 2 years ago

See this blog post series: https://recursion.wtf/posts/rust_schemes_3/ which describes a scheme for managing recursive data structures which is likely to be (much) faster than our existing code and also avoid stack overflows.

sanket1729 commented 2 years ago

I will explore this weekend. At the very least, we should be able to copy the algorithms.

apoelstra commented 2 years ago

A couple other related ideas bumping around in my brain:

So on a high level here's what I'm imagining ... basically Policy becomes backed by a Vec of PolicyNodes (the blog post uses "Layer" for what we call "Node"); each PolicyNode is an enum where the variants contain &-refs to other PolicyNodes. These &-refs will point into the same vector; we will use Pin somehow to achieve this. Then we have a .view(&self) -> &PolicyNode method which returns a ref to the topmost PolicyNode which the user can easily match on.

Obviously the devil is in the details :)

apoelstra commented 2 years ago

I think the Layer terminology comes from https://blog.sumtypeofway.com/posts/introduction-to-recursion-schemes.html which the recursion crate is based on, and is probably worth reading in its entirety (even though it's a 5-part series, written in Haskell, and uses Haskell features that don't nicely map to Rust)

apoelstra commented 2 years ago

cc @sipa in case you are interested in any of this (if not feel free to unsubscribe :P)

sipa commented 2 years ago

The C++ miniscript implementation has generic "run a recursive algorithm on a miniscript tree" code (specified in the form of "here is a function to compute an associated piece of data for each node based on its parent's associated data", and a "here is a function to compute the result value of a parent given the associated data and result value of its children"), which seems to work very well (pretty much all recursive functions are implemented using those; including ToString, ToScript, the duplicate key check, finding the smallest badly-typed subtree for error reporting, signing logic, checking satisfiability).

See https://github.com/bitcoin/bitcoin/blob/v24.0rc2/src/script/miniscript.h#L324L408 for generic tree-walking code if you're interested. More code using it in https://github.com/sipa/miniscript/blob/master/bitcoin/script/miniscript.h.

Not sure how relevant this is for the issue here, but it may be an inspiration for algorithms that are free of unbounded stack space usage, especially with tapscript-miniscript potentially increasing maximum recursion depth significantly.

apoelstra commented 2 years ago

Sounds good, @sipa. I think if we continue to use naive recursive data structures this is the way we'll wind up going. Though there are a few limitations -- one is about efficiency, since it's slow to make a zillion tiny allocations/deallocations and also bad for cache locality -- but another, more Rust-specific, is about the ergonomics of matching on recursive structures.

Then there is the issue that we'd like to expose library code to allow construction and pattern-matching on arbitrary miniscripts without allowing people to construct things that don't typecheck (since these will eventually trigger assertions if you try to poke them too much).

apoelstra commented 2 years ago

Adding one more link to this mess of an issue: this reddit post about generalizing over Rc and Arc has an interesting "double-trait" trick which lets you get something like an infinite type which may allow us to create a fixpoint type analogous to the one in the Haskell blog post.

apoelstra commented 2 years ago

Probably relevant: https://smallcultfollowing.com/babysteps/blog/2022/06/27/many-modes-a-gats-pattern/

(Sorry, now I'm just using this issue as a general "how can we better leverage our type system" notepad.)

apoelstra commented 1 year ago

After spending a few hours on this, I think

Basically let's just revisit this after the next MSRV bump, and if we're lucky we'll have a better-defined language by then. Then we should start by defining a generic slab-backed tree type, use these Haskell blog posts as inspiration for what APIs we want to define on that, and then try porting our various trees into it. It wouldn't surprise me if we can eventually 10x the performance of this library while making it easier to add functionality.

As an aside, one way I considered to bypass Rust's memory model is to drop into C :}. But because we're trying to construct complicated Rust enums it would be very difficult to pass these things across the FFI barrier, not to mention the heavy optimization cost we'd face.

apoelstra commented 1 year ago

Relevant (about ergonomically bundling a slab with a recursive type that points into it) https://old.reddit.com/r/rust/comments/zc8pq0/selfreferential_types_for_fun_and_profit/izp774p/

apoelstra commented 2 months ago

This one is also highly relevant to rust-miniscript and its context parameter (and most of my attempts to de-recursive-ify the data structures):

http://ais523.me.uk/blog/scoped-generics.html