Lily-Technologies / lily-wallet

The best way to secure your bitcoin.
https://lily-wallet.com

Improve performance by adding a txs cache #38

Open bilthon opened 4 years ago

bilthon commented 4 years ago

I see that even with a wallet with very few transactions (3 in my case), the app takes ~10 seconds to provide a fresh address when using the block explorer as the provider.

This happens because the app has to scan until the end of the address gap in the scanForAddressesAndTransactions function before being ready.

In a previous project I used the dexie library to create a tx cache that lets the app start right away from information cached during previous scans. The scan then happens in the background, and the user only gets an update if there's something new.
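
The pattern described above can be sketched roughly like this (an in-memory store stands in for Dexie/IndexedDB, and `fetchTxs` is a hypothetical network call; this is just an illustration of the control flow, not Lily's actual code):

```javascript
// Sketch of the cache-first startup pattern: resolve immediately with whatever
// the last scan left in the cache, then refresh from the network in the
// background and notify only when something changed.
const createTxCache = (store, fetchTxs) => ({
  // Resolve right away with cached txs (empty on first run)
  async loadCached(walletId) {
    return store.get(walletId) || []
  },
  // Background scan; invoke onUpdate only if the result differs.
  // Naive change check: compare counts (a real impl would diff txids/heights)
  async refresh(walletId, onUpdate) {
    const cached = store.get(walletId) || []
    const fresh = await fetchTxs(walletId)
    if (fresh.length !== cached.length) {
      store.set(walletId, fresh)
      onUpdate(fresh)
    }
  }
})
```

With Dexie, `store.get`/`store.set` would become queries against an IndexedDB table, but the cache-first control flow stays the same.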

What would be your thoughts around this? Maybe Dexie (which uses IndexedDB) could be replaced by SQLite.

KayBeSee commented 4 years ago

I agree the load time for accounts is really slow.

I am hesitant about adding a full-blown database, but I just googled IndexedDB and looked at Dexie, and this seems like a really good middle-ground approach. Is there a way to store data in IndexedDB encrypted? I'm not seeing anything about it.

Another approach would be just saving an encrypted JSON file with addresses and possibly transactions, similar to how the config is getting written and retrieved in next.

bilthon commented 4 years ago

Regarding encryption, I guess we could do it, but then we would lose the indexing capabilities (at least if everything is encrypted). Do you think encryption is really required for this data, though? The tx history is publicly available on the blockchain anyway.

I guess there could be some privacy concerns, but then again, all of this data was obtained by hitting a public block explorer (at least when not using a node).

vesparny commented 4 years ago

I have done what @bilthon proposed in a react native app I am working on.

  1. Addresses are always scanned to check for balances using a gap limit of 20. Checked addresses, as well as the last free internal and external indexes, are stored in AsyncStorage.
  2. Txs are fetched only up to the internal and external indexes (so the scan normally stops well before the gap limit is reached), and only if address balances changed since the previous scan.

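
Step 2 above boils down to a small pure function: diff freshly fetched balances against the cached ones and re-fetch txs only for addresses that changed. A sketch (names and object shapes are illustrative):

```javascript
// Given the previous scan's addresses and a fresh balance scan, return the
// addresses whose balance changed (or that are new) and therefore need their
// txs re-fetched. Entries look like { address, index, balance }.
const addressesToRescan = (cachedAddresses, freshAddresses) => {
  const cachedByAddress = new Map(
    cachedAddresses.map(a => [a.address, a.balance])
  )
  return freshAddresses
    .filter(a => {
      const prev = cachedByAddress.get(a.address)
      // new address, or confirmed/unconfirmed totals moved since last scan
      return (
        !prev ||
        prev.confirmed !== a.balance.confirmed ||
        prev.unconfirmed !== a.balance.unconfirmed
      )
    })
    .map(a => a.address)
}
```
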
bilthon commented 4 years ago

@vesparny and did you make use of the IndexedDB ?

vesparny commented 4 years ago

Nope, AsyncStorage on React Native.

vesparny commented 3 years ago

I want to leave here the implementation I use to cache transactions. This also improves the code to better distinguish between received and sent transactions. I don't have time for a PR, but I'm happy to share what I came up with.

I do not currently use this code because in the meantime I moved toward an implementation based on ElectrumX, similar to what is described in #5.

I basically scan every address during the fetchBalance phase, then cache all the addresses and keep a list of those whose balance differs from the previous fetch. This way I know which addresses I need to re-scan when fetching transactions.

@KayBeSee let me know if the code is not clear

// explorer-api.js
const client = require('./fetch-client')

const EXPLORER_URL = 'https://blockstream.info/api'

const getUtxosForAddress = async address => {
  return await client(`${EXPLORER_URL}/address/${address}/utxo`)
}

const getTxsForAddress = async address => {
  return await client(`${EXPLORER_URL}/address/${address}/txs`)
}

const getAddress = async address => {
  return await client(`${EXPLORER_URL}/address/${address}`)
}

// broadcast: the raw tx hex is passed as the request body to ./fetch-client
const broadcastTx = async txHex => {
  return await client(`${EXPLORER_URL}/tx`, txHex)
}

const getTipHeight = async () => {
  return await client(`${EXPLORER_URL}/blocks/tip/height`)
}

module.exports = {
  getAddress,
  getUtxosForAddress,
  getTxsForAddress,
  broadcastTx,
  getTipHeight
}

const api = require('./explorer-api')
const bitcoin = require('./bitcoin')

const serializeTransactions = (allTxs, addresses) => {
  const unconfirmedTxs = []
  const confirmedTxs = []
  for (const tx of allTxs) {
    const ownedInputAmount = tx.vin.reduce((acc, val) => {
      const value =
        addresses.indexOf(val.prevout.scriptpubkey_address) > -1
          ? val.prevout.value
          : 0
      acc += value
      return acc
    }, 0)

    const ownedOutputAmount = tx.vout.reduce((acc, val) => {
      const value =
        addresses.indexOf(val.scriptpubkey_address) > -1 ? val.value : 0
      acc += value
      return acc
    }, 0)

    const enhancedTx = {
      ...tx,
      type: ownedOutputAmount - ownedInputAmount > 0 ? 'received' : 'sent',
      value: Math.abs(ownedOutputAmount - ownedInputAmount)
    }
    enhancedTx.status.block_time
      ? confirmedTxs.push(enhancedTx)
      : unconfirmedTxs.push(enhancedTx)
  }
  // txids are hex strings, so compare lexicographically
  // (numeric subtraction on strings would yield NaN)
  unconfirmedTxs.sort((a, b) => b.txid.localeCompare(a.txid))
  confirmedTxs.sort((a, b) => b.status.block_time - a.status.block_time)
  return [...unconfirmedTxs, ...confirmedTxs]
}

const getTransactionsFromAddress = async address => {
  const res = await api.getTxsForAddress(address)
  return res
}

const getAddressBalance = async address => {
  const res = await api.getAddress(address)
  return {
    confirmed: res.chain_stats.funded_txo_sum - res.chain_stats.spent_txo_sum,
    unconfirmed:
      res.mempool_stats.funded_txo_sum - res.mempool_stats.spent_txo_sum,
    confirmedTxCount: res.chain_stats.tx_count,
    unconfirmedTxCount: res.mempool_stats.tx_count
  }
}

const findCachedAddressAtIndex = (cachedAddresses, index) => {
  // cachedAddresses is keyed by address; find the entry at the given derivation index
  return Object.values(cachedAddresses).find(el => el.index === index)
}
const scanForAddressesAndTransactions = async (
  addressesToBeScanned,
  cachedAddresses,
  freeAddressIndex
) => {
  let txs = []
  let i = 0

  while (i < freeAddressIndex) {
    const cachedAddress = findCachedAddressAtIndex(cachedAddresses, i)
    // guard: the cache may not contain an entry for this index yet
    if (
      cachedAddress &&
      addressesToBeScanned.indexOf(cachedAddress.address) > -1
    ) {
      // address balance changed since the last check; re-fetch its txs
      const transactions = await getTransactionsFromAddress(
        cachedAddress.address
      )
      txs = [...txs, ...transactions]
    }

    i += 1
  }
  return txs
}

const scanForAddresses = async (
  zpub,
  cachedAddresses,
  gapLimit,
  isChange = false
) => {
  let addresses = []
  let freeAddressIndexArray = []
  let gap = 0
  let i = 0

  while (gap < gapLimit) {
    const cachedAddress = findCachedAddressAtIndex(cachedAddresses, i)
    const address = cachedAddress
      ? cachedAddress.address
      : bitcoin.getAddressFromXPub(zpub, i, isChange)
    const balance = await getAddressBalance(address)
    addresses = [...addresses, ...[{ address, balance, index: i }]]
    if (balance.confirmed === 0 && balance.unconfirmed === 0) {
      gap += 1
      // keep only the last `gapLimit` unused indexes; when the loop exits,
      // the first of them marks the start of the final gap (the free index)
      freeAddressIndexArray = [...freeAddressIndexArray, ...[i]].slice(
        gapLimit * -1
      )
    } else {
      gap = 0
    }

    i += 1
  }
  return {
    addresses,
    freeAddressIndex:
      typeof freeAddressIndexArray[0] !== 'undefined'
        ? freeAddressIndexArray[0]
        : 0
  }
}

const getTransactionsFromXPub = async (
  addressesToBeScanned,
  internalAddresses,
  externalAddresses,
  freeInternalAddressIndex,
  freeExternalAddressIndex
) => {
  const internalTransactions = await scanForAddressesAndTransactions(
    addressesToBeScanned,
    internalAddresses,
    freeInternalAddressIndex
  )
  const externalTransactions = await scanForAddressesAndTransactions(
    addressesToBeScanned,
    externalAddresses,
    freeExternalAddressIndex
  )
  // a Set of objects does not dedupe by value, so dedupe by txid explicitly
  const byTxid = new Map()
  for (const tx of [...externalTransactions, ...internalTransactions]) {
    if (!byTxid.has(tx.txid)) byTxid.set(tx.txid, tx)
  }
  const organizedTransactions = serializeTransactions(
    [...byTxid.values()],
    [...Object.keys(internalAddresses), ...Object.keys(externalAddresses)]
  )
  return organizedTransactions.map(tx => {
    return {
      txid: tx.txid,
      status: {
        block_time: tx.status.block_time,
        confirmed: tx.status.confirmed,
        block_height: tx.status.block_height
      },
      type: tx.type,
      value: tx.value
    }
  })
}

const getBalanceFromXPub = async (
  zpub,
  internalAddresses,
  externalAddresses,
  gapLimit
) => {
  const internalAddressesScan = await scanForAddresses(
    zpub,
    internalAddresses,
    gapLimit,
    true
  )
  const externalAddressesScan = await scanForAddresses(
    zpub,
    externalAddresses,
    gapLimit
  )
  const mergedAddresses = [
    ...internalAddressesScan.addresses,
    ...externalAddressesScan.addresses
  ]
  const totals = mergedAddresses.reduce(
    (acc, val) => {
      acc.confirmed = acc.confirmed + val.balance.confirmed
      acc.unconfirmed = acc.unconfirmed + val.balance.unconfirmed
      return acc
    },
    { confirmed: 0, unconfirmed: 0 }
  )
  return {
    confirmed: totals.confirmed,
    unconfirmed: totals.unconfirmed,
    scannedInternalAddresses: internalAddressesScan.addresses,
    scannedExternalAddresses: externalAddressesScan.addresses,
    freeInternalAddressIndex: internalAddressesScan.freeAddressIndex,
    freeExternalAddressIndex: externalAddressesScan.freeAddressIndex
  }
}

module.exports = {
  getTransactionsFromXPub,
  getBalanceFromXPub,
  getAddressBalance
}
bilthon commented 3 years ago

Interesting, @vesparny. I suppose cachedAddresses there is the object you store in AsyncStorage, right?

This will work; however, I would find it preferable to use a database-like engine instead of AsyncStorage, which I believe is a key-value store, right?

IndexedDB in this case would allow more complex queries to be performed more efficiently. Correct me if I'm wrong, but I think you're loading all the user's txs into memory here. If that's the case, it might work fine for simple wallets, but it will become increasingly costly as wallet usage grows.

But regarding this feature, I'd also like to hear from @KayBeSee as I noted this line in the features in the readme file:

Stateless: There is no database. The app is populated from a password encrypted configuration file

This will of course not be the case if we were to implement this. The possibility of encrypting this data was also raised, but in that case I think we lose the querying capabilities IndexedDB would allow for. My question would be: is it really important that the wallet is stateless and doesn't leave a footprint like a tx history? I get that that's a nice privacy bonus, but it would be at odds with usability. What do you guys think?

KayBeSee commented 3 years ago

I would love to store previously queried data on Lily since I am noticing too that it takes a while to startup and load all the transaction history and UTXOs for my wallets.

The most rudimentary way of doing this without modifying too much code would be to store the accountMap object that gets constructed at startup as a JSON or encrypted text file, like the config, and just load it at startup.

Once that's loaded, start making network calls from "tail" points, like the last address with a transaction and addresses with UTXOs attached to them from the previous lookup. Once those points have been scanned, we can dive deeper into older addresses just to make sure no funds have been sent there.
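
Picking the "tail" resume point from a cached account map could look something like this (a sketch only; the entry shape and `txCount` field are illustrative assumptions, not Lily's actual accountMap):

```javascript
// Return the highest cached derivation index that saw any activity
// (balance or past transactions). The rescan restarts from this index
// instead of walking every address from 0.
const findResumeIndex = cachedAddresses => {
  return cachedAddresses.reduce((tail, a) => {
    const hasActivity =
      a.balance.confirmed > 0 ||
      a.balance.unconfirmed > 0 ||
      (a.txCount || 0) > 0
    return hasActivity && a.index > tail ? a.index : tail
  }, 0)
}
```

A deeper sweep of the indexes below the resume point can then run later in the background, as described above, to catch funds sent to older addresses.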

I am leaning towards the AsyncStorage idea just because React Native uses it as well, so I can reuse a lot of the code for the mobile app. I could be persuaded to use a more robust database solution, though. What kind of queries would we need IndexedDB for over just a file?

vesparny commented 3 years ago

@bilthon yes, exactly. It's just an array of addresses and their corresponding indexes, nothing fancy. I didn't need any particular queries, so I just store everything in AsyncStorage. So far it handles hundreds of addresses in milliseconds. No big deal.

KayBeSee commented 3 years ago

Still working towards this, but holding off for now since querying Electrum is 10x as fast for startup. It probably still makes sense to couple this with #59.