ElementsProject / lightning

Core Lightning — Lightning Network implementation focusing on spec compliance and performance

Bad commit sig in splicing (CI fail) #7546

Open rustyrussell opened 1 month ago

rustyrussell commented 1 month ago

From PR https://github.com/ElementsProject/lightning/pull/7542 which has only bookkeeper fixes from master (db4a26d94 "renepay: remove unused declaration"):

2024-08-10T02:08:40.1093981Z lightningd-2 2024-08-10T01:49:25.771Z DEBUG   0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#1: Creating commit_sig signature 1 304402206c26347629e87268478626853d596fae951b669c4ef317bf23b84e230a289c30022025c3daf898612f832475bed2eb61793957aea25a980cb87c8c3eb58c34cce18901 for tx 0200000001be0db33eb76ce5458bf2c9e1c080908fe900fc802b3b4decbfa53eb7b26410e701000000009db0e280024a01000000000000220020be7935a77ca9ab70a4b8b1906825637767fed3c00824aa90c988983587d68488d0b510000000000022002091fb9e7843a03e66b4b1173482a0eb394f03a35aae4c28e8b4b1f575696bd7939b3ed620 wscript 522102324266de8403b3ab157a09f1f784d587af61831c998c151bcc21bb74c2b2314b2102e3bd38009866c9da8ec4aa99cc4ea9c6c0dd46df15c61ef0ce1f271291714e5752ae key 02e3bd38009866c9da8ec4aa99cc4ea9c6c0dd46df15c61ef0ce1f271291714e57

2024-08-10T02:08:40.1136511Z lightningd-1 2024-08-10T01:49:26.035Z DEBUG   022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: billboard perm: Bad commit_sig signature 1 3044022047d8e155975f91ba5eec40b897071a5a4519e09b621a68e93b2c5b83af51a397022042be0dcc2c385f222a3ab2235bc425bc15a7a5dbb23c3f51b110b19ad435b75601 for tx 0200000001be0db33eb76ce5458bf2c9e1c080908fe900fc802b3b4decbfa53eb7b26410e701000000009db0e280024a01000000000000220020be7935a77ca9ab70a4b8b1906825637767fed3c00824aa90c988983587d68488d0b510000000000022002091fb9e7843a03e66b4b1173482a0eb394f03a35aae4c28e8b4b1f575696bd7939b3ed620 wscript 522102324266de8403b3ab157a09f1f784d587af61831c998c151bcc21bb74c2b2314b2102e3bd38009866c9da8ec4aa99cc4ea9c6c0dd46df15c61ef0ce1f271291714e5752ae key 02e3bd38009866c9da8ec4aa99cc4ea9c6c0dd46df15c61ef0ce1f271291714e57 feerate 3755. Cur funding be0db33eb76ce5458bf2c9e1c080908fe900fc802b3b4decbfa53eb7b26410e6, splice_info: 33b471de228279de411950e30dde2694c8e00e8a94444bb3dc27e47cd8b4e813, race_await_commit: no, inflight splice count: 0

What's weird is that these two keys and txs are the same. So why does it not like the signature?

job-logs.txt

ddustin commented 1 month ago

Time-based jiggling (i.e. injected delays) would be super helpful for reproducing things like this.

This smells like a timing issue related to one node thinking the splice is locked and the other not getting there for a while, and doing stuff in the meantime.

Can the dev disconnect system be used to hold on to a packet for a bit of time?

ddustin commented 1 month ago

Looking at the logs a little closer, the "Creating commit_sig" and "Bad commit_sig" lines quoted above are from different commit_sig cycles: their signature values don't match.
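To make that kind of check less error-prone than eyeballing hex, here is a minimal sketch that pairs "Creating commit_sig" and "Bad commit_sig" lines by signature value and reports pairs whose tx differs. It assumes the log line layout seen in the excerpts above; the names `COMMIT_SIG_RE` and `find_mismatched_cycles` are made up for illustration and are not part of Core Lightning.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the log excerpts in this issue:
#   "<node> ... (Creating|Bad) commit_sig signature <n> <sig> for tx <tx> ..."
COMMIT_SIG_RE = re.compile(
    r"(?P<node>lightningd-\d+) .*?"
    r"(?P<kind>Creating|Bad) commit_sig signature \d+ "
    r"(?P<sig>[0-9a-f]+) for tx (?P<tx>[0-9a-f]+)"
)

def find_mismatched_cycles(lines):
    """Group commit_sig log lines by signature; return the cases where the
    signer's tx differs from the tx the verifier checked against."""
    by_sig = defaultdict(lambda: {"Creating": [], "Bad": []})
    for line in lines:
        m = COMMIT_SIG_RE.search(line)
        if m:
            by_sig[m["sig"]][m["kind"]].append((m["node"], m["tx"]))

    mismatches = []
    for sig, kinds in by_sig.items():
        for bad_node, checked_tx in kinds["Bad"]:
            for signer, signed_tx in kinds["Creating"]:
                if signed_tx != checked_tx:
                    mismatches.append({
                        "sig": sig,
                        "signer": signer,
                        "verifier": bad_node,
                        "signed_tx": signed_tx,
                        "checked_tx": checked_tx,
                    })
    return mismatches
```

Feeding it the lines of the attached job-logs.txt (for example `find_mismatched_cycles(open("job-logs.txt"))`) should list each rejected signature next to the tx it was actually created for, provided both lines are present in the log.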

I found the "Creating commit_sig" entry whose signature matches the first commit_sig failure, pasted here together with the failure:

2024-08-10T02:08:40.1070209Z lightningd-2 2024-08-10T01:49:25.606Z DEBUG   0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#1: Creating commit_sig signature 1 3044022047d8e155975f91ba5eec40b897071a5a4519e09b621a68e93b2c5b83af51a397022042be0dcc2c385f222a3ab2235bc425bc15a7a5dbb23c3f51b110b19ad435b75601 for tx 020000000133b471de228279de411950e30dde2694c8e00e8a94444bb3dc27e47cd8b4e81300000000009db0e280024a01000000000000220020be7935a77ca9ab70a4b8b1906825637767fed3c00824aa90c988983587d68488302f0f000000000022002091fb9e7843a03e66b4b1173482a0eb394f03a35aae4c28e8b4b1f575696bd7939b3ed620 wscript 522102324266de8403b3ab157a09f1f784d587af61831c998c151bcc21bb74c2b2314b2102e3bd38009866c9da8ec4aa99cc4ea9c6c0dd46df15c61ef0ce1f271291714e5752ae key 02e3bd38009866c9da8ec4aa99cc4ea9c6c0dd46df15c61ef0ce1f271291714e57

2024-08-10T02:08:40.1136511Z lightningd-1 2024-08-10T01:49:26.035Z DEBUG   022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: billboard perm: Bad commit_sig signature 1 3044022047d8e155975f91ba5eec40b897071a5a4519e09b621a68e93b2c5b83af51a397022042be0dcc2c385f222a3ab2235bc425bc15a7a5dbb23c3f51b110b19ad435b75601 for tx 0200000001be0db33eb76ce5458bf2c9e1c080908fe900fc802b3b4decbfa53eb7b26410e701000000009db0e280024a01000000000000220020be7935a77ca9ab70a4b8b1906825637767fed3c00824aa90c988983587d68488d0b510000000000022002091fb9e7843a03e66b4b1173482a0eb394f03a35aae4c28e8b4b1f575696bd7939b3ed620 wscript 522102324266de8403b3ab157a09f1f784d587af61831c998c151bcc21bb74c2b2314b2102e3bd38009866c9da8ec4aa99cc4ea9c6c0dd46df15c61ef0ce1f271291714e5752ae key 02e3bd38009866c9da8ec4aa99cc4ea9c6c0dd46df15c61ef0ce1f271291714e57 feerate 3755. Cur funding be0db33eb76ce5458bf2c9e1c080908fe900fc802b3b4decbfa53eb7b26410e6, splice_info: 33b471de228279de411950e30dde2694c8e00e8a94444bb3dc27e47cd8b4e813, race_await_commit: no, inflight splice count: 0

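Looking into it, the txs do indeed differ for this cycle. The difference is the channel funding outpoint: lightningd-2 signed a tx whose input is the txid logged as splice_info, while lightningd-1 checked the signature against a tx with a different input.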
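For anyone who wants to reproduce the comparison, the following is a rough sketch that decodes the two raw txs from the log lines above and lists the fields that differ. It assumes the logged hex is a plain (non-witness) serialization, which holds for these two strings; `SIGNED_TX`, `CHECKED_TX`, `parse_unsigned_tx` and `diff_txs` are illustrative names, not anything from the Core Lightning tree.

```python
from io import BytesIO

# Copied from the "Creating commit_sig" (lightningd-2) and
# "Bad commit_sig" (lightningd-1) log lines above.
SIGNED_TX = "020000000133b471de228279de411950e30dde2694c8e00e8a94444bb3dc27e47cd8b4e81300000000009db0e280024a01000000000000220020be7935a77ca9ab70a4b8b1906825637767fed3c00824aa90c988983587d68488302f0f000000000022002091fb9e7843a03e66b4b1173482a0eb394f03a35aae4c28e8b4b1f575696bd7939b3ed620"
CHECKED_TX = "0200000001be0db33eb76ce5458bf2c9e1c080908fe900fc802b3b4decbfa53eb7b26410e701000000009db0e280024a01000000000000220020be7935a77ca9ab70a4b8b1906825637767fed3c00824aa90c988983587d68488d0b510000000000022002091fb9e7843a03e66b4b1173482a0eb394f03a35aae4c28e8b4b1f575696bd7939b3ed620"

def read_varint(s):
    """Read a Bitcoin compact-size integer from the stream."""
    n = s.read(1)[0]
    if n < 0xFD:
        return n
    size = {0xFD: 2, 0xFE: 4, 0xFF: 8}[n]
    return int.from_bytes(s.read(size), "little")

def parse_unsigned_tx(hex_tx):
    """Decode a non-witness tx into a flat {field_path: value} dict.
    The txid is kept in raw serialization order (not byte-reversed)."""
    s = BytesIO(bytes.fromhex(hex_tx))
    f = {"version": int.from_bytes(s.read(4), "little")}
    for i in range(read_varint(s)):
        f[f"input[{i}].prev_txid_raw"] = s.read(32).hex()
        f[f"input[{i}].vout"] = int.from_bytes(s.read(4), "little")
        f[f"input[{i}].script_sig"] = s.read(read_varint(s)).hex()
        f[f"input[{i}].sequence"] = int.from_bytes(s.read(4), "little")
    for i in range(read_varint(s)):
        f[f"output[{i}].amount_sat"] = int.from_bytes(s.read(8), "little")
        f[f"output[{i}].script_pubkey"] = s.read(read_varint(s)).hex()
    f["locktime"] = int.from_bytes(s.read(4), "little")
    if s.read(1):
        raise ValueError("trailing bytes: not a plain non-witness tx?")
    return f

def diff_txs(a, b):
    """Return {field_path: (value_in_a, value_in_b)} for differing fields."""
    keys = sorted(a.keys() | b.keys())
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

for field, (signed, checked) in diff_txs(
    parse_unsigned_tx(SIGNED_TX), parse_unsigned_tx(CHECKED_TX)
).items():
    print(f"{field}: signed={signed} checked={checked}")
```

On these two strings the differing fields are the input's txid and output index (0 vs 1), plus the amount of the second output (995120 vs 1095120 sat); scripts, sequence and locktime are identical. The amount difference is consistent with the two sides disagreeing about which funding output is current, though that reading is an inference from the hex, not something the logs state.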
