Closed dpc closed 1 month ago
Ping @tvolk131
@dpc do you mean you've recently added LDK (not LND)? ie you suspect LDK is the culprit
@moneyball Yes, sorry. LDK . I fixed the post.
Recently we've added LDK as one of the nodes, and now we have to wait around 60s for everything to initialize. The culprit seems to be LDK sync intervals - waiting up to 10s after every step is quite a lot. Though possibly there are other things that could be improved to enable fast startup.
If we could somehow improve these times by setting sync intervals lower, etc. it would be a big quality of life improvement.
I'm not quite sure I understand the full extent of the issue. We never 'wait' on anything besides the initial fee estimation on startup? Everything else should happen in the background. Could you expand what you're trying to do and which 'steps' are waiting on what exactly?
Secondly, if you really find yourself needing to adjust the defaults (though note they have been picked conservative for good reason as syncing via Esplora can often hit rate-limiting, especially when using public servers), you can already do so via adjusting the onchain_wallet_sync_interval_secs
, wallet_sync_interval_secs
, fee_rate_cache_update_interval_secs
config options (see https://docs.rs/ldk-node/latest/ldk_node/struct.Config.html).
It would be helpful for us if we could reduce (or eliminate) this constant:
It blocks us from setting the wallet sync interval to anything faster than 10 seconds:
I don't think this accounts for the full extent of the added delay for us, but it seems like the easiest place to start
I realize that Node::sync_wallets()
exists. That would be non-trivial for us to use though, since our Node
instance lives behind a rust interface, so we'd need to expose additional behavior in that interface. Is there anything preventing us from lowering WALLET_SYNC_INTERVAL_MINIMUM_SECS
to, say, 1 second?
It would be helpful for us if we could reduce (or eliminate) this constant:
It blocks us from setting the wallet sync interval to anything faster than 10 seconds:
Right, as the name indicates, that is its intended purpose :)
I don't think this accounts for the full extent of the added delay for us, but it seems like the easiest place to start
These operations are happening in the background, so could you clarify where you seeing any delay exactly and what you're trying to do?
I realize that
Node::sync_wallets()
exists. That would be non-trivial for us to use though, since ourNode
instance lives behind a rust interface, so we'd need to expose additional behavior in that interface.
Right, it's also mostly there for testing purposes and we more or less discourage using it to avoid spamming sync commands.
Is there anything preventing us from lowering
WALLET_SYNC_INTERVAL_MINIMUM_SECS
to, say, 1 second?
Yes, most Esplora servers (in particular public ones, but also private ones) are heavily rate-limited and will return an HTTP429 ("too many requests") whenever it's hit. Unfortunately hitting even one such error results in the sync starting over (as especially for Lightning we have to be very cautious to avoid any inconsistencies encountered during syncing), which leads to yet more requests being generated, increasing the risk of further hitting the rate-limits. So this lower-bound is essentially a safety to disallow spamming Esplora servers which results in an unusable wallet and potentially leading to security issues when no sync would ever finish.
Please also note:
a) This is really about chain syncing, i.e., syncing to a state that only changes every 10 minutes. Having an added worst-case delay of 10s on top doesn't really make much of a difference under most circumstances.
b) A lot of the syncing stuff is about to get much more efficient with the BDK upgrade, and we'll add support for sourcing full blocks (soon) and BIP157/158 filtered blocks (when kyoto
is ready), which will not be encumbered by a sync interval and are probably the way to go for your usecase anyways.
Thanks for that explanation, that makes sense.
The issue we're running into is after calling Node::connect_open_channel()
and mining 10 regtest blocks, waiting for the channel to appear when calling Node::list_channels()
with ChannelDetails.is_channel_ready == true
takes around 30-40 seconds.
Could it make sense to ignore WALLET_SYNC_INTERVAL_MINIMUM_SECS
if Config.network
is set to regtest? This way we could make sure our local esplora has rate-limiting turned off, then safely set the interval to 1 second.
Thanks for that explanation, that makes sense.
The issue we're running into is after calling
Node::connect_open_channel()
and mining 10 regtest blocks, waiting for the channel to appear when callingNode::list_channels()
withChannelDetails.is_channel_ready == true
takes around 30-40 seconds.
Ah, this is the exact use-case why we added sync_wallets
to allow triggering the sync instantly. Would using it your (CI?) script really not be an option?
Could it make sense to ignore
WALLET_SYNC_INTERVAL_MINIMUM_SECS
ifConfig.network
is set to regtest? This way we could make sure our local esplora has rate-limiting turned off, then safely set the interval to 1 second.
I guess we could consider that if you really prefer it, although I think sync_wallets
is what you want in testing: you know a new block is available, and you sync to the tip immediately.
I think
sync_wallets
is what you want in testing
Ok I can wire that up 👍 the documentation for it scared me off, but if this is the case it's meant for then I'll give it a try
I think we can close this issue for now - I'll re-open it if I run into any issues with sync_wallets
Cool, let me know if you'd prefer the sketched carve-out for regtest
after all. Closing until then.
@tnull from my testing within fedimint, it seems Node::sync_wallets()
improves the startup speed of our dev environment a bit, but doesn't allow us to bump Config.onchain_wallet_sync_interval_secs
and Config.wallet_sync_interval_secs
up to their default values (80 and 30 seconds respectively) without introducing significant slowdown.
During dev startup (just mprocs
, for Fedimint folks), we wait for lightning gateways to sync to regtest chain tip several times. I have a test branch where before each chain sync wait, we also call sync_wallets()
on the ldk-node::Node
instance. I was hoping this would allow us to change onchain_wallet_sync_interval_secs
and wallet_sync_interval_secs
to their default values, but this slows things down quite a bit.
When onchain_wallet_sync_interval_secs
and wallet_sync_interval_secs
are both set to the minimum value of 10 seconds, calling sync_wallets()
or not makes almost no difference. But when they are both set to their default values, calling sync_wallets()
cuts down startup time by ~45%. Here's some rough startup speeds of different configurations:
onchain_wallet_sync_interval_secs
& wallet_sync_interval_secs
of 10, not calling sync_wallets()
: ~55s
onchain_wallet_sync_interval_secs
& wallet_sync_interval_secs
of 10, calling sync_wallets()
: ~50s
onchain_wallet_sync_interval_secs
& wallet_sync_interval_secs
of default values (80 and 30), not calling sync_wallets()
: ~180s
onchain_wallet_sync_interval_secs
& wallet_sync_interval_secs
of default values (80 and 30), calling sync_wallets()
: ~100s
It's worth noting that in my testing, in the last case (onchain_wallet_sync_interval_secs
& wallet_sync_interval_secs
are default values and we're calling sync_wallets()
), the first sync is very slow but the subsequent ones are just a few seconds.
These results are strange to me. Clearly calling sync_wallets()
is doing something, hence the 45% speedup. But I would expect the last case above to produce results similar to the first and second cases above. Is this a bug, or am I misunderstanding how sync_wallets()
works?
During dev startup (just mprocs, for Fedimint folks), we wait for lightning gateways to sync to regtest chain tip several times. I have a test branch where before each chain sync wait, we also call sync_wallets() on the ldk-node::Node instance. I was hoping this would allow us to change onchain_wallet_sync_interval_secs and wallet_sync_interval_secs to their default values, but this slows things down quite a bit.
I'm still not quite sure what you mean by "slowdown"/"slows things down quite a bit": If you call sync_wallets
the Node
will sync the on-chain and lightning wallets and update the fee rate estimation, and the call will block until they are synced (mod syncing failures due to connection issues/ratelimiting etc., of course). So if the call returns the node will be synced to whatever the current chain tip at that specific time was, or rather what the electrs
instance actually picked up until then (as there is not only a latency for broadcasting, bitcoind
processing the transaction and adding it to the mempool, but also an additional latency for electrs
to be synced to the latest bitcoind
state). This is btw. why we use a wait_for_tx
method in our integration tests that polls with some backoff until a particular transaction we're interested is seen via Esplora before we proceed to call sync_wallets
in a few places.
But when they are both set to their default values, calling sync_wallets() cuts down startup time by ~45%.
I'm not sure what you mean with "startup" time? How do you measure that? Calling sync_wallets
definitely doesn't cut down on any processing time, it only allows to trigger syncing right when you're positive that a transaction you're waiting on has been confirmed and has been picked up by Esplora.
It's worth noting that in my testing, in the last case (onchain_wallet_sync_interval_secs & wallet_sync_interval_secs are default values and we're calling sync_wallets()), the first sync is very slow but the subsequent ones are just a few seconds.
What do you mean by 'slow'? How do you measure that exactly? Does it refer to the time stated in the logs or do you measure it in another way? And are you referring to the sync_wallets
calls, or to the background syncing?
In Fedimint we use a lot of e2e tests, where we set up a full local environment including bitcoind, electrs, bunch of lightning nodes with channels between them, and Fedimint federation. The whole thing can initialize in less than 20 seconds.
Recently we've added LDK as one of the nodes, and now we have to wait around 60s for everything to initialize. The culprit seems to be LDK sync intervals - waiting up to 10s after every step is quite a lot. Though possibly there are other things that could be improved to enable fast startup.
If we could somehow improve these times by setting sync intervals lower, etc. it would be a big quality of life improvement.