bitcoindevkit / bdk

A modern, lightweight, descriptor-based wallet library written in Rust!
Other
796 stars 288 forks source link

[chain] It should be easy to get the transaction in canonical order #1333

Open LLFourn opened 5 months ago

LLFourn commented 5 months ago

So as ordered by most recent to oldest vise versa

What do we mean by this?

  1. We could mean that an tx that was confirmed earlier. But what if it was broadcast earlier? Do we want transactions switching order.
  2. We could introduce the concept of first_seen which stores the first time you see the tx. Ordered by first_seen and tiebroken by confirmation height should give a nice consistent list of transactions that doesn't switch order too much. Note this wouldn't change ChangeSet behavior at all since we already have last_seen.
  3. Do transactions that are non-canonical belong in a list? e.g. transactions that were RBF'd.
notmandatory commented 3 months ago

Can we move this to a post-1.0 milestone? it looks like a feature that can be added with a new function without breaking any of the existing apis.

evanlinjin commented 2 weeks ago

last_seen is for conflict resolution of unconfirmed transactions (since we are a wallet and we cannot see the complete mempool). However, sometimes we would like to use when the tx is first_seen in the mempool if the chain source provides this info - i.e. bitcoind RPC. We want to do this as this information is more useful for a user and it means less updating what is persisted.

I think we should rename last_seen to seen_in_mempool, and introduce a created_by_wallet timestamp as well (similar to the first_seen concept, but only set if the local wallet has created the tx). We should still order by seen_in_mempool, but tiebreak with created_by_wallet.

For Esplora/Electrum chain sources, we should make sure that a single update should only introduce one seen_in_mempool value for all unconfirmed txs (use the timestamp that we start sync at). This way, tx order will not jump around too much.

LLFourn commented 1 week ago

last_seen is for conflict resolution of unconfirmed transactions (since we are a wallet and we cannot see the complete mempool). However, sometimes we would like to use when the tx is first_seen in the mempool if the chain source provides this info - i.e. bitcoind RPC. We want to do this as this information is more useful for a user and it means less updating what is persisted.

Is this a proposal to bring first_seen as a new field into the TxGraph changeset? It's an interesting idea but I had earlier dismissed it as not needed to do TxGraph's core job. We do need a way to get a correctly ordered list of unconfirmed transactions somehow though and if we say it's not TxGraph's job then whose job is it?

I think the flaw in my original proposal is that I am assuming that all changesets will be played back one by one. If they are aggregated then you would miss all the earlier last_seens. Rolling with your idea:

// tx_graph.rs

#[must_use]
pub struct ChangeSet<A = ()> {
    /// Added transactions.
    pub txs: BTreeSet<Arc<Transaction>>,
    /// Added txouts.
    pub txouts: BTreeMap<OutPoint, TxOut>,
    /// Added anchors.
    pub anchors: BTreeSet<(A, Txid)>,
    /// Added last-seen unix timestamps of transactions.
    pub seen_in_mempool: BTreeMap<Txid, SeenInMempool>,
}

#[derive(Debug, Clone, PartialEq, Default)]
#[cfg_attr(
    feature = "serde",
    derive(serde::Deserialize, serde::Serialize),
    serde(crate = "serde_crate")
)]
pub struct SeenInMempool {
    first_seen: u64,
    elapsed_since: u64,
}

impl SeenInMempool {
    pub fn last_seen(&self) -> u64 {
        self.first_seen.saturating_add(self.elapsed_since)
    }
}

impl Append for SeenInMempool {
    fn append(&mut self, other: Self) {
        let last_seen = self.last_seen().max(other.last_seen());
        let first_seen = self.first_seen.min(other.first_seen);

        self.first_seen = first_seen;
        self.elapsed_since = last_seen - first_seen;
    }

    fn is_empty(&self) -> bool {
        false
    }
}

The idea that frist_seen monotonically decreases and last_seen() monotonically increases. After thinking about it I think this is by far the best approach. Saying that getting an ordered list of transactions by first_seen is not the job of TxGraph is going to be annoying because it's the place where we would get an ordered list of confirmed transactions (from the graph). I think we should include frist_seen like this otherwise it's going to be a PITA from an API perspective later on.

I think we should rename last_seen to seen_in_mempool, and introduce a created_by_wallet timestamp as well (similar to the first_seen concept, but only set if the local wallet has created the tx). We should still order by seen_in_mempool, but tiebreak with created_by_wallet.

I don't understand motivation for this. Why would it be helpful to tiebreak by created_by_wallet.

evanlinjin commented 5 days ago

@LLFourn My only complaint is that SeenInMempool::elapsed_since is a confusing name (since if we know the time now, we can calculate the time elapsed since first_seen). Maybe a better name would be elapsed_until_last_seen?

The idea with created_by_wallet is based on the assumption that seen_in_mempool would always be the last-seen in mempool timestamp for electrum/esplora. We need a way to order unconfirmed txs with the same seen_in_mempool timestamp. I thought that using when the tx was created would be good for this. However, this would only be good if the wallet was creating it's own transactions.

The downsides of the solution you suggested, is if the caller doesn't sync/scan very frequently, which would result in many unconfirmed txs to have the same first_seen. However, typically a wallet would sync after creating/broadcasting a tx.

Therefore, I'm in favour of your solution.

evanlinjin commented 5 days ago

@LLFourn mentioned that it's good practice to set first_seen straight after broadcast. This is the same concept as created_by_wallet.