Open apoelstra opened 2 years ago
I will explore this weekend. At the very least, we should be able to copy the algorithms.
A couple other related ideas bumping around in my brain:
MiniscriptView
/PolicyView
etc type which is a tree that we users can use to pattern-match with etc, but which can't be directly constructed (so that you can't construct Miniscript programs that don't typecheck)&
-pointers rather than Box
-pointers which makes matching much much easier.So on a high level here's what I'm imagining ... basically Policy
becomes backed by a Vec
of PolicyNode
s (the blog post uses "Layer" for what we call "Node"); each PolicyNode
is an enum where the variants contain &
-refs to other PolicyNodes
. These &
-refs will point into the same vector; we will use Pin
somehow to achieve this. Then we have a .view(&self) -> &PolicyNode
method which returns a ref to the topmost PolicyNode
which the user can easily match on.
Obviously the devil is in the details :)
I think the Layer
terminology comes from https://blog.sumtypeofway.com/posts/introduction-to-recursion-schemes.html which the recursion
crate is based on, and is probably worth reading in its entirety (even though it's a 5-part series, written in Haskell, and uses Haskell features that don't nicely map to Rust)
cc @sipa in case you are interested in any of this (if not feel free to unsubscribe :P)
The C++ miniscript implementation has generic "run a recursive algorithm on a miniscript tree" code (specified in the form of "here is a function to compute an associated piece of data for each node based on its parent's associated data", and a "here is a function to compute the result value of a parent given the associated data and result value of its children"), which seems to work very well (pretty much all recursive functions are implemented using those; including ToString, ToScript, the duplicate key check, finding the smallest badly-typed subtree for error reporting, signing logic, checking satisfiability).
See https://github.com/bitcoin/bitcoin/blob/v24.0rc2/src/script/miniscript.h#L324L408 for generic tree-walking code if you're interested. More code using it in https://github.com/sipa/miniscript/blob/master/bitcoin/script/miniscript.h.
Not sure how relevant this is for the issue here, but it may be an inspiration for algorithms that are free of unbounded stack space usage, especially with tapscript-miniscript potentially increasing maximum recursion depth significantly.
Sounds good, @sipa. I think if we continue to use naive recursive data structures this is the way we'll wind up going. Though there are a few limitations -- one is about efficiency, since it's slow to make a zillion tiny allocations/deallocations and also bad for cache locality -- but another, more Rust-specific, is about the ergonomics of matching on recursive structures.
Then there is the issue that we'd like to expose library code to allow construction and pattern-matching on arbitrary miniscripts without allowing people to construct things that don't typecheck (since these will eventually trigger assertions if you try to poke them too much).
Adding one more link to this mess of an issue: this reddit post about generalizing over Rc and Arc has an interesting "double-trait" trick which lets you get something like an infinite type which may allow us to create a fixpoint type analogous to the one in the Haskell blog post.
Probably relevant: https://smallcultfollowing.com/babysteps/blog/2022/06/27/many-modes-a-gats-pattern/
(Sorry, now I'm just using this issue as a general "how can we better leverage our type system" notepad.)
After spending a few hours on this, I think
Miniscript
structure (and Policy
etc would work the same). Annoyingly we have to use a raw pointer ... using a Vec<T>
, Box<[T]>
or even a *mut [T]
all run into limitations related to the "stacked borrows" aliasing rules in Rust, either because things are forbidden or they're simply undefined-as-of-now.offset_from
method on pointers (basically, pointer subtraction) wasn't added til Rust 1.46. So if you want to mess around with pointers -- e.g. to map over all nodes in a tree and reconstituting the pointers in the map output -- you have to cast to integers, do math there, and cast back ... which again runs into poorly-defined areas of stacked borrows.Basically let's just revisit this after the next MSRV bump, and if we're lucky we'll have a better-defined language by then. Then we should start by defining a generic slab-backed tree type, use these Haskell blog posts as inspiration for what APIs we want to define on that, and then try porting our various trees into it. It wouldn't surprise me if we can eventually 10x the performance of this library while making it easier to add functionality.
As an aside, one way I considered to bypass Rust's memory model is to drop into C :}. But because we're trying to construct complicated Rust enums it would be very difficult to pass these things across the FFI barrier, not to mention the heavy optimization cost we'd face.
Relevant (about ergonomically bundling a slab with a recursive type that points into it) https://old.reddit.com/r/rust/comments/zc8pq0/selfreferential_types_for_fun_and_profit/izp774p/
This one is also highly relevant to rust-miniscript and its context parameter (and most of my attempts to de-recursive-ify the data structures):
See this blog post series: https://recursion.wtf/posts/rust_schemes_3/ which describes a scheme for managing recursive data structures which is likely to be (much) faster than our existing code and also avoid stack overflows.