Open afilini opened 1 month ago
Changed my bisect scripts slightly, now it also points to: be07a9c9c5283d171ed86b677d601881c7032e89 (maybe the extra generic for the error in `translate_pk`?)
Great find! I'll investigate specifically this API.
As I mentioned to you offline, this trait was originally designed as a "Swiss army knife" trait which I hoped to use to do every possible transformation of a Miniscript, from counting keys to lifting to semantic policies to annotating keys. But the result of this is that basically every single call that uses this trait instantiates a whole new copy of a bunch of code.
I think it would make more sense to split this into multiple traits or even to just use a collection of closures for translation.
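To make the "split traits or closures" idea concrete, here is a minimal sketch on a toy AST. `Fragment`, `map_keys` and `for_each_key` are hypothetical names invented for illustration, not rust-miniscript APIs. The closures are passed as `&mut dyn FnMut`, so each method is compiled once per key type rather than once per caller, which is one possible way to limit monomorphization.

```rust
// Toy AST standing in for a Miniscript fragment tree (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
enum Fragment<Pk> {
    PkK(Pk),
    After(u32),
    AndV(Box<Fragment<Pk>>, Box<Fragment<Pk>>),
}

impl<Pk> Fragment<Pk> {
    /// Key translation only: every key goes through `f`, the structure is copied as-is.
    /// Taking `&mut dyn FnMut` avoids one instantiation per closure type.
    fn map_keys<Q, E>(
        &self,
        f: &mut dyn FnMut(&Pk) -> Result<Q, E>,
    ) -> Result<Fragment<Q>, E> {
        Ok(match self {
            Fragment::PkK(pk) => Fragment::PkK(f(pk)?),
            Fragment::After(n) => Fragment::After(*n),
            Fragment::AndV(l, r) => Fragment::AndV(
                Box::new(l.map_keys(&mut *f)?),
                Box::new(r.map_keys(&mut *f)?),
            ),
        })
    }

    /// Read-only traversal (e.g. counting keys), kept separate so that it
    /// does not pull in any of the translation code.
    fn for_each_key(&self, f: &mut dyn FnMut(&Pk)) {
        match self {
            Fragment::PkK(pk) => f(pk),
            Fragment::After(_) => {}
            Fragment::AndV(l, r) => {
                l.for_each_key(&mut *f);
                r.for_each_key(&mut *f);
            }
        }
    }
}
```

With this shape, counting keys is `ast.for_each_key(&mut |_| count += 1)` and never touches the translation path. Whether dynamic dispatch is an acceptable cost in the real library is a separate question.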
Looking more closely, some observations:
- The `Translator` trait itself is pretty tight. It has only two generics: a source keytype and a target keytype.
- `Descriptor::at_derivation_index` is also pretty tight: it defines a newtype around a `u32` and implements the `Translator` trait on it with concrete error types.
- `at_derivation_index` is also not defined on a generic `Descriptor` but on a specific `Descriptor<DescriptorPublicKey>`.
So I'm pretty surprised to see that this particular method is blowing up. Could it be that rustc is making multiple copies of the `Derivator` newtype and its `Translator` impl? What if you just move that code outside of the `at_derivation_index` function so that `at_derivation_index` itself is a one-liner?
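The suggested experiment could look roughly like the sketch below. Every type here is a toy stand-in, not the real rust-miniscript definition: `WildcardKey` and `DefiniteKey` are hypothetical key types, and the toy `Translator` carries an explicit error parameter purely as an assumption for the example. The point is only the shape: the `Derivator` newtype and its impl live at module level, and `at_derivation_index` is a single call.

```rust
// Hypothetical stand-ins for the source and target key types.
#[derive(Debug, Clone, PartialEq)]
struct WildcardKey { base: String }
#[derive(Debug, Clone, PartialEq)]
struct DefiniteKey { base: String, index: u32 }

// Toy translator trait; the error parameter `E` is an assumption of this sketch.
trait Translator<P, Q, E> {
    fn pk(&mut self, pk: &P) -> Result<Q, E>;
}

// Toy descriptor that only holds a list of keys.
#[derive(Debug, Clone, PartialEq)]
struct Descriptor<Pk> { keys: Vec<Pk> }

impl<Pk> Descriptor<Pk> {
    fn translate_pk<Q, E, T: Translator<Pk, Q, E>>(
        &self,
        t: &mut T,
    ) -> Result<Descriptor<Q>, E> {
        let keys = self.keys.iter().map(|k| t.pk(k)).collect::<Result<Vec<_>, E>>()?;
        Ok(Descriptor { keys })
    }
}

#[derive(Debug, PartialEq)]
enum ConversionError { HardenedIndex }

// Hoisted out of the method body: newtype around a u32 with a concrete error type.
struct Derivator(u32);

impl Translator<WildcardKey, DefiniteKey, ConversionError> for Derivator {
    fn pk(&mut self, pk: &WildcardKey) -> Result<DefiniteKey, ConversionError> {
        if self.0 >= 0x8000_0000 {
            return Err(ConversionError::HardenedIndex);
        }
        Ok(DefiniteKey { base: pk.base.clone(), index: self.0 })
    }
}

impl Descriptor<WildcardKey> {
    // The method itself is now a one-liner.
    fn at_derivation_index(&self, index: u32) -> Result<Descriptor<DefiniteKey>, ConversionError> {
        self.translate_pk(&mut Derivator(index))
    }
}
```

Comparing binary size before and after such a move would show whether the placement of the newtype matters at all; this sketch makes no claim that it does.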
Alternately, could the real blowup be that by using `Translator` you are pulling in `crate::Error` and all its bloat, whereas you otherwise wouldn't be?
Ok, looking even more closely, I think the code blowup comes from the `Miniscript::translate_pk` function which does a giant match and pulls in all the type-checking code and all its error paths and everything, so even though it's only instantiated once inside `at_derivation_index`, this one time is really big.
I think this would be resolved by using a context object during construction, which would let us have a slab allocator, which would let us have "direct" translation where we just replaced the keys without touching the hashes or any of the structural properties of the script.
Thank you so much for taking a look!
> Alternately, could the real blowup be that by using `Translator` you are pulling in `crate::Error` and all its bloat, whereas you otherwise wouldn't be?
Yes! In my "synthetic" test that I used for bisecting (see attached scripts below) it looks like that's where a lot of the increase comes from. In my real firmware, unfortunately, making that change doesn't have a huge impact. I'll have to investigate further; it may be that since I'm doing other miniscript operations I already had all the `Error` code in my binary.
This is the setup I use to bisect:
Ok, I was able to pin down the issue in my firmware: almost all the new `translate_pk` methods now go through the various constructors (for example `Pkh::new()`), which internally run all the sanity checks.
Constructing those structs directly in `translate_pk` shaves off 40KiB and brings it back in line with the older version.
Yep, makes sense. But the existing constructor mechanism is there to prevent translations from converting good scripts to bad ones, so we can't fix the codegen issue this way. We need to do something smarter.
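The trade-off between the two approaches can be shown on a toy type. This is only a sketch: `Key::is_sane`, `InsaneKey`, `translate_checked` and `translate_direct` are hypothetical names, and this `Pkh` is a stand-in, not the rust-miniscript struct. The checked path reruns the constructor's validation on every translation; the direct path skips it and can therefore produce an invalid value.

```rust
/// Hypothetical key trait; `is_sane` stands in for the real sanity checks.
trait Key {
    fn is_sane(&self) -> bool;
}

impl Key for String {
    fn is_sane(&self) -> bool {
        !self.is_empty()
    }
}

#[derive(Debug, PartialEq)]
struct InsaneKey;

// Toy stand-in for a descriptor type with a validating constructor.
#[derive(Debug, Clone, PartialEq)]
struct Pkh<Pk> {
    pk: Pk,
}

impl<Pk: Key> Pkh<Pk> {
    /// Validating constructor: the only way to fail is a key that is not sane.
    fn new(pk: Pk) -> Result<Self, InsaneKey> {
        if pk.is_sane() { Ok(Pkh { pk }) } else { Err(InsaneKey) }
    }

    /// Translation through the constructor: safe, but drags the checks
    /// (and their error paths) into every instantiation.
    fn translate_checked<Q: Key>(&self, f: impl Fn(&Pk) -> Q) -> Result<Pkh<Q>, InsaneKey> {
        Pkh::new(f(&self.pk))
    }

    /// Direct struct construction: no checks compiled in, but nothing stops
    /// `f` from turning a good `Pkh` into a bad one.
    fn translate_direct<Q: Key>(&self, f: impl Fn(&Pk) -> Q) -> Pkh<Q> {
        Pkh { pk: f(&self.pk) }
    }
}
```

Here `translate_direct(|_| String::new())` happily returns a `Pkh` holding an empty key, which is the kind of good-to-bad conversion the constructors exist to prevent.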
Related to #585, I have been investigating what caused the recent code size increase and it looks like it is due to a few different changes, one of them being the switch to use `from_ast` in the translator. For example, through bisecting I got it down to this specific commit as one of the offenders: https://github.com/rust-bitcoin/rust-miniscript/commit/fa2e8b41735a2433984d98a0caf23beda57d7ab5
For context, when building my firmware, commenting out the content of `Descriptor::at_derivation_index()` and replacing it with a `todo!()` reduces the overall size by 160KiB! Doing the same on an older version of miniscript (9.0.2) decreases the size by 60KiB.
`at_derivation_index()` mainly just uses the translator to map keys, and it happens to be the only instance of the translators in use in my code. That's why commenting it out causes LTO to prune all that code and save a lot of space.
I tried reverting that `from_ast()` change specifically but unfortunately it didn't make much difference. I'll keep digging but at least we now have a rough idea of where to look...