ElementsProject / lightning

Core Lightning — Lightning Network implementation focusing on spec compliance and performance
Other
2.85k stars 902 forks source link

lightningd: bulk query state changes and stats #7611

Closed JssDWt closed 1 month ago

JssDWt commented 2 months ago

listpeerchannels: bulk query state changes and stats

Description

State changes were fetched from the database in listpeerchannels sequentially for every channel. The performance overhead for large nodes was very high. This commit loads state changes once at the start of the listpeerchannels call, then passes the appropriate state changes to the underlying function that maps into json.

Changes Made

Checklist

Ensure the following tasks are completed before submitting the PR:

Additional Notes

This PR comes from a trace of a slow pay call on a node with many channels.

JssDWt commented 2 months ago

wondering if the channel state is changing ofter in a busy node that make this code overcomplex because you are just query the node anyway?

Did you considered to query the db with a less features RPC command where you get all the information that you need without looking at the db?

Yeah that's definitely an option. The problem I'm trying to fix is that the pay command is very slow (> 15 seconds). With tracing I have identified that listpeerchannels is the main culprit. This PR should improve the loading time of listpeerchannels, because instead of doing 2 database queries per channel sequentially, you do 2 queries at the start and use those values to pass down to invividual channel mappings.

If further performance improvements are needed, I could add flags to exclude the stats and state changes from the listpeerchannels response. In this case though, I thought I could get away with a change that doesn't include interface updates.

Just pointing out that it's not the fact that the node is busy that's the culprit here, but the amount of channels.

cdecker commented 2 months ago

Jesse and I briefly discussed what the best way to tackle this was, and we landed on a caching strategy, making the per-channel DB query into a per-listpeerchannels-request query, masssively reducing the roundtrips between node and DB.

We can of course consider alternative strategies, a) such as caching at a finer grained level, b) merging the counters back into the channels struct (as it was before we prematurely optimized...), or c) add a flag to listpeerchannels to tell it to skip any optional fields in the result, such as the stats and state transitions.

cdecker commented 2 months ago

ACK https://github.com/ElementsProject/lightning/pull/7611/commits/3ffb3884070975f912fd8b3849e713acaa4ba25d

JssDWt commented 2 months ago

added changelog and rebased

rustyrussell commented 1 month ago

We should keep the stats in the channel struct, and update the db when they change, like we do for everything else. Then listpeerchannels doesn't need the db at all. It's also far simpler.

See https://github.com/ElementsProject/lightning/pull/7679

As an aside: we should allow access to the json_filter, so json_add_channel() etc could ask "do we even need this field?" and skip unneeded calculations. But that's not justified here: this db access shouldn't even exist!

JssDWt commented 1 month ago

We should keep the stats in the channel struct, and update the db when they change, like we do for everything else. Then listpeerchannels doesn't need the db at all. It's also far simpler.

Makes sense!

JssDWt commented 1 month ago

Shall I close this in favor of #7679 @rustyrussell ?

ShahanaFarooqui commented 1 month ago

Closing it in favor of #7679