ElementsProject / lightning

Core Lightning — Lightning Network implementation focusing on spec compliance and performance
Other
2.85k stars 902 forks source link

All attempts to open channels fail mysteriously #60

Closed gwillen closed 8 years ago

gwillen commented 8 years ago

I'm not much help on the why of this one, but every attempt I make to open a channel is failing, here's an example.

The testnet bitcoind I'm talking to has been running for a long time, ignored by me, but appears to be up to date with the chain (it's on block 1012853, which matches some random explorer I looked at.)

(Note: This is not the first failed attempt I've made with this particular transaction, so it's possible that this is not the real failure, and this failure is actually "you already used this transaction" or "you already broke something with a previous failure", in which case the bug is that it doesn't say so.)

./daemon/lightning-cli connect 128.199.80.48 8334 010000000108de68e965b8b89d3faacbb8be39f98bbe8beec84645ee834fbd5b85df79005a010000006b48304502210090db29ed1b2b7555b94bf863ece893fc588fe6632ed789c2042a42f68071bcfd022069cc5884d20409d8abf339fb1438a423c483ea059a5aa5b5f67f1789200bcb72012103417a853d089af430f8ec6593d94daa7d98842f467ff1439bf8ef6ecae59f1aebfeffffff0240420f000000000017a914e13ebe92e2e2290c88497a7d8e635485217733a68738700105000000001976a9147ad3816abe25f6932738a9a7d65d681c10db3c9b88ac0d740f00 [hangs]

lightningd says:

OUT-128.199.80.48:8334: Connected out to 128.199.80.48:8334 id 02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706, changing prefix 02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706: Entered error state STATE_ERR_INTERNAL 02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706: Peer breakdown: nothing to do 02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706: Disconnected

getlog debug says:

{ "type" : "INFO", "time" : "99.399146956", "source" : "OUT-128.199.80.48:8334:", "log" : "Connected out to 128.199.80.48:8334 id 02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706, changing prefix" }, 
    { "type" : "DEBUG", "time" : "99.399173959", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "Using fee rate 252000" }, 
    { "type" : "DEBUG", "time" : "99.399516067", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "db_create_peer(02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706)" }, 
    { "type" : "DEBUG", "time" : "99.399524245", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "db_start_transaction(02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706)" }, 
    { "type" : "DEBUG", "time" : "99.399757572", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "db_commit_transaction(02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706)" }, 
    { "type" : "DEBUG", "time" : "99.408130983", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "Queued pkt PKT__PKT_OPEN (order=0)" }, 
    { "type" : "DEBUG", "time" : "99.408137194", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "CMD_OPEN_WITH_ANCHOR: STATE_INIT => STATE_OPEN_WAIT_FOR_OPEN_WITHANCHOR (out PKT__PKT_OPEN)" }, 
    { "type" : "DEBUG", "time" : "99.408148105", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "pkt_out: writing PKT__PKT_OPEN" }, 
    { "type" : "DEBUG", "time" : "99.615157245", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "Decrypted header len 5" }, 
    { "type" : "DEBUG", "time" : "99.615197380", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "Received packet LEN=5, type=PKT__PKT_RECONNECT" }, 
    { "type" : "DEBUG", "time" : "99.615203960", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "CMD_SEND_COMMIT: STATE_OPEN_WAIT_FOR_OPEN_WITHANCHOR => STATE_ERR_INTERNAL" }, 
    { "type" : "BROKEN", "time" : "99.615214327", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "Entered error state STATE_ERR_INTERNAL" }, 
    { "type" : "INFO", "time" : "99.615229732", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "Peer breakdown: nothing to do" }, 
    { "type" : "DEBUG", "time" : "99.615235671", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "peer_breakdown: STATE_ERR_INTERNAL => STATE_CLOSED" }, 
    { "type" : "DEBUG", "time" : "99.615247719", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "db_forget_peer(02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706)" }, 
    { "type" : "DEBUG", "time" : "99.615261653", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "db_start_transaction(02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706)" }, 
    { "type" : "DEBUG", "time" : "99.615714001", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "db_commit_transaction(02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706)" }, 
    { "type" : "DEBUG", "time" : "99.620063221", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "pkt_out: no IO possible, closing" }, 
    { "type" : "INFO", "time" : "99.620095618", "source" : "02915506c736ffec49ad58fc021779600dcd2b7a52ac97690571aea5b4d9be2706:", "log" : "Disconnected" }, 
cdecker commented 8 years ago

Looks like some residual state on the cat server is aborting the connection on your side. My guess is that the server remembers you while your node has no idea why it is attempting a reconnect.

@rustyrussell: can you check on the server's side?

gwillen commented 8 years ago

Note that this doesn't only happen on the cat server; it happens identically on the lightning clicker server: http://159.203.218.14:8000/

(Although in both cases I didn't inspect the very first failure, so it's possible this is only a secondary failure.)

cdecker commented 8 years ago

Ok, that's strange indeed, I also just checked that the outputs you used to fund the channel is still available.

Maybe a dev-crash can help here?

BTW: thanks for pointing the game out, opening a channel as we speak :+1:

cdecker commented 8 years ago

I reproduced the problem without trying to. It would appear that the nodes became disconnected and re-connecting resulted in an erroneous packet from the server, triggering the unilateral close.

Here's a filtered log (without chaintips and subprocesses).

rustyrussell commented 8 years ago

gwillen notifications@github.com writes:

I'm not much help on the why of this one, but every attempt I make to open a channel is failing, here's an example.

Looks like your version is old?

The protocol changed to always send a PKT_RECONNECT. Which is now called PKT_INIT.

Cheers, Rusty. PS. That's not to say there aren't other problems :)

gwillen commented 8 years ago

Hm, it seems that I was on branch "debug-log" which has not been updated to keep up with master. Should I switch to master or do you want to rebase debug-log?

Thanks!

gwillen commented 8 years ago

Indeed it seems that on branch master, this problem goes away (being replaced with other problems.) Thanks! I will leave this open for you to close in case you want to answer my question about the fate of the debug-log branch first.

rustyrussell commented 8 years ago

Let me delete that old branch...