Closed Kixunil closed 9 months ago
I think it'd be confusing and annoying to just rename it, especially for people who don't care about performance or who aren't using obscenely-large transactions.
We could try closing the Transaction
structure and then we could cache it.
I remember very well first seeing the API years ago that I was confused and didn't know if it actually is cheap to call or not.
When I first used rust-bitcoin
I only discovered it when optimizing performance.
It's obvious when you understand what's happening under the hood, but when you are new rust-bitcoin
user or you're are just focused on something else, it's very easy to overlook.
If a function is not a cheap getter, it should just have a verb in a name.
I think it'd be confusing and annoying to just rename it,
To help migration: the old name could be kept at first with a comment, then deprecated with a message to use a new name, then removed completely.
Other than that I don't think couple of letters more make a big difference.
I'm always tempted to name things like this backwards (txid_compute
) so the LSP auto-completion does the right thing for the user that intuitively types tx.txid
.
We could try closing the Transaction structure and then we could cache it.
That requires an extra field and either:
&mut Self
which is IMO more constraining and annoying,Sync
or something like that.It's annoying but likely the correct type of annoyance.
@dpc IIRC LSP can see doc(alias = "txid")
Regarding caching, we could use a 32-byte array of atomics (or packing it into u64
atomics) and an atomic that txid is set. Notice that txid doesn't change if transaction didn't so multiple threads would write the same data which is fine.
However there's a much bigger problem: invalidation. It'd mean literally no mutable references to the inside except for Witness
(an if we're caching wtxid too then not even those). There would have to be a setter for everything and at best we might be able to use proxy objects to make it look a bit better. I think this is unacceptable.
It'd mean literally no mutable references to the inside except for
Witness
(an if we're caching wtxid too then not even those).
Well, any accessor to mutable references could just turn off the "txid is set" bit.
Hmm, OK, that could work...
I'm tempted to at least try it and maybe benchmark it.
Plausibly we could even avoid the atomics, with a bit of ugliness. We'd have:
txid
(and wtxid
etc)TxInRef
or whatever custom type, whose destructors would recompute the txid.The main issue here is that *Ref
types are really annoying to work with (e.g. the type returned by Mutex
or RefCell
. So it is probably better to move recomputation into txid
using atomics. (But ofc then there is a perf hit from the use of atomics.
Another solution would be to split the type into an immutable Transaction
with all cached data, and a mutable TransactionConstructor
which only had an expensive compute_txid
method.
Also, flagging that we have a similar issue with blockhashes and the Block
type, which is a POD type but whose hash requires us to hash all contained transactions.
But ofc then there is a perf hit from the use of atomics.
I suspect that recomputing Txid every time you change the transaction regardless of whether you need it would be a much bigger perf hit.
Immutable/mutable split sounds similar to my idea of FrozenTransaction
just different names.
Immutable/mutable split sounds similar to my idea of
FrozenTransaction
just different names.
Nice. I think we should give the short name to the frozen variant because transactions rarely change once they're constructed.
So move the current thing to be TransactionMut
? I'd do it after abstract transactions so that we can deduplicate stuff.
@dpc IIRC LSP can see
doc(alias = "txid")
That is great to know. Thanks!
Also, flagging that we have a similar issue with blockhashes and the
Block
type, which is a POD type but whose hash requires us to hash all contained transactions.
There is a thousands ways one might want organize computation of things like txids, or handling blockchain data. Trying to come up with complications to anticipate them is just a loosing battle, IMO. If you have N versions of e.g. Transactions
then there's a need to make them interoperable, introduce traits to glue them and complexity keeps growing. In addition one might want to do zero-copy deserialization (there was already PR like that around), or store things in data-oriented architecture, etc.
It seems more practical to focus on the most common, basic and natural type to be used in most software for interoperability, let the users deal with txid
as they see fit (e.g. pass (TxId, Transaction)
in the APIs and leave fancy stuff to extra libraries or the future.
The issue here is that txid
being CPU heavy and could use a name adjustment to make it more obvious. How did it lead to ... TransactionMut
? :D
If anyone just wants to cache txid()
they can wrap Transaction
in a struct, add field for a cache and deal with it however they need. Not exactly difficult. It's easier to add functionality that is needed, then remove one that one doesn't want.
Yeah, I'm not particularly happy about any of the solutions. I mainly discussed hoping that maybe we can figure out something sensible.
Yeah, I'm not particularly happy about any of the solutions. I mainly discussed hoping that maybe we can figure out something sensible.
depricating txid
in favor of compute_txid
seems like a good start. I agree that txid
sounds like a simple cheap operation while compute_txid
is a better name for what's going on.
I just had an idea that there could be another motivation for FrozenTransaction
: supporting more efficient layout similar to that of Witness
.
However, I don't believe this is worth changing Transaction
for 1.0.
I think that the whole transaction
module and its dependencies are quite battle-tested and widely used in multiple crates. Whatever crazy ideas we come up with can be added as another extension crate, especially after supporting abstract transactions. I really want to focus on 1.0 of the leaf crates.
@dpc
There is a thousands ways one might want organize computation of things like txids, or handling blockchain data. Trying to come up with complications to anticipate them is just a loosing battle, IMO.
Bitcoin Core has CTransaction
and CMutableTransaction
and this has worked fine for them for many years, even though C++ has a much weaker type system which can obfuscate what's going on.
In this crate we have Script
(now ScriptBuf
) as well as script::Builder
. This is a very common idiom in Rust. If we also had transaction::Builder
and block::Builder
this would be nicely consistent.
@Kixunil
However, I don't believe this is worth changing
Transaction
for 1.0.
You may be right. Renaming txid
to compute_txid
, providing a deprecation cycle, and using #[doc(alias)]
is easy and has little room for bikeshedding (it sounds like we've agreed on the name, at least :)). And it would be an improvement over the status quo.
Moving to a hyper-efficient builder model feels like there's a lot of design space. We could store the flat bytes of the transaction along with offsets, and/or tweak things to improve alignment, and/or exploit system endianness to cast bytes to u32, etc etc. And how would transaction::Builder
interact with PSBT? Seems like it should, but damned if I know how. And this can be easily done in a parallel crate and later pulled into 2.0 if this makes sense.
The name
txid
looks like a simple getter. I remember very well first seeing the API years ago that I was confused and didn't know if it actually is cheap to call or not.Alternatively we could have something like
struct FrozenTransaction(Transaction, Txid)
for people who want to cache it but I'm not sure about its viability.See https://github.com/rust-bitcoin/rust-bitcoin/issues/2358#issuecomment-1900892424