ZmnSCPxj / clboss

Automated Core Lightning Node Manager
MIT License

plugin-clboss: Killing plugin: JSON-RPC response "id"-field is not a u64 #47

Closed by hosiawak 3 years ago

hosiawak commented 3 years ago

My clboss got killed after running for a few days. The message in the log was:

plugin-clboss: Killing plugin: JSON-RPC response "id"-field is not a u64

Unfortunately I didn't have DEBUG logs enabled for clboss.

The log message prior to the error was:

lightningd: Sending 279628174msat over 7 hops to deliver 278212722msat

hosiawak commented 3 years ago

I'm also now getting this in the console (not in the log). I'm not sure if it is related, but I haven't seen it before:

 Unhandled exception in concurrent task!
Sqlite3::Tx:
            DELETE FROM "OnchainFeeMonitor_samples"
             ORDER BY id
             LIMIT :limit
                 ;
            : near "ORDER": syntax error
...main loop continuing...

ZmnSCPxj commented 3 years ago

The second issue looks like a problem with the SQLITE3 on your system: there is a build option, SQLITE_ENABLE_UPDATE_DELETE_LIMIT, that enables this syntax: https://sqlite.org/compile.html#enable_update_delete_limit. It looks like it would be a good idea to move SQLITE3 inside external/ so that we have control of the SQLITE3 flags. On the Linux systems I have seen, the flag is enabled and the system SQLITE3 understands DELETE ... ORDER BY.

Can you try this on your system?

$ sqlite3 :memory:
sqlite> CREATE TABLE "Foo"(id INTEGER PRIMARY KEY);
sqlite> DELETE FROM "Foo" ORDER BY id LIMIT 1;

(exit sqlite3 by ^D or .quit)

If you get the same error, then it seems the system SQLITE3 does not enable the flag. If so, a workaround you can try is to compile SQLITE3 from source into a library, making sure the flag is enabled (there are probably a bunch of other flags that need to be enabled too, like the foreign key flags). Then run ./configure in CLBOSS, indicating the appropriate SQLITE3_CFLAGS and SQLITE3_LIBS:

env SQLITE3_CFLAGS=-I/path/to/sqlite3 SQLITE3_LIBS=/path/to/sqlite3/libsqlite3.a ./configure && make

(The above is untested, sorry. I use a fairly common SQLITE3 setup macro for autoconf, so it "should" work, but I have not tried it myself.)

You can see more information with ./configure --help.
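For readers who cannot rebuild SQLITE3, a common alternative is to rewrite the statement so that it does not need the optional syntax at all: select the ids to delete in a subquery, which every SQLITE3 build accepts. The sketch below uses Python's built-in sqlite3 module purely for illustration. The table name comes from the error message above, but the `data` column and the helper name `delete_oldest` are assumptions, and the thread does not say whether the actual CLBOSS fix took this approach.

```python
import sqlite3


def delete_oldest(conn, limit):
    """Delete the `limit` rows with the smallest id.

    Uses a subquery instead of DELETE ... ORDER BY ... LIMIT, so it does not
    depend on SQLITE_ENABLE_UPDATE_DELETE_LIMIT being compiled in.
    """
    with conn:  # commit on success, roll back on error
        conn.execute(
            'DELETE FROM "OnchainFeeMonitor_samples" '
            'WHERE id IN ('
            '  SELECT id FROM "OnchainFeeMonitor_samples" '
            '  ORDER BY id LIMIT :limit'
            ')',
            {"limit": limit},
        )


# Hypothetical schema for the demo: only the table name is from the thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "OnchainFeeMonitor_samples"'
    '(id INTEGER PRIMARY KEY, data INTEGER)'
)
conn.executemany(
    'INSERT INTO "OnchainFeeMonitor_samples"(data) VALUES (?)',
    [(n,) for n in range(5)],
)

delete_oldest(conn, 2)
remaining = [
    row[0]
    for row in conn.execute(
        'SELECT id FROM "OnchainFeeMonitor_samples" ORDER BY id'
    )
]
print(remaining)
```

Note that Python may link against a different SQLITE3 than the `sqlite3` command-line tool, so this is not a substitute for the shell test above when checking the system library.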


For the id error, it is probably because I was an idiot and passed JSON numbers through double. Sigh. Lemme try something that should help fix this. It can be tested by running a script that continuously runs lightning-cli getinfo to stress-test the id.
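To illustrate why passing an id through a double can produce something that is not a u64, here is a small Python sketch. It assumes the double is written back with six significant digits, as C's `%g` does by default. The thread does not show how CLBOSS actually serialized the value, so treat the formatting step and all function names here as hypothetical.

```python
import re

U64_MAX = 2**64 - 1


def is_u64_token(token):
    """True if `token` is a plain decimal integer that fits in 64 unsigned bits."""
    return re.fullmatch(r"[0-9]+", token) is not None and int(token) <= U64_MAX


def id_via_double(request_id):
    """Echo an id after converting it to a double and formatting it like %g."""
    return "%g" % float(request_id)


def id_via_integer(request_id):
    """Echo an id while keeping it an integer throughout."""
    return str(int(request_id))


# Small ids survive the round trip through a double...
print(id_via_double(999999), is_u64_token(id_via_double(999999)))
# ...but once the id needs a seventh digit, the output switches to exponent form.
print(id_via_double(1000000), is_u64_token(id_via_double(1000000)))
# Keeping the id as an integer avoids the problem.
print(id_via_integer(1000000), is_u64_token(id_via_integer(1000000)))
# Separately, a double cannot represent every integer above 2**53 exactly.
print(float(2**53 + 1) == float(2**53))
```

Under this assumption the plugin would behave correctly until the id counter grows large enough, which would be consistent with the failure only appearing after a few days of uptime.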

Unfortunately I will be offline for a few days for some internal garbage-collection processes, but lemme try to push a quick fix for the id issue on master (though you do have problems with autoreconf -i too, right?).

ZmnSCPxj commented 3 years ago

Hi @hosiawak, I pushed a quick fix, hope it works. Try this package and ignore the version number for now: clboss-0.9A.tar.gz. It includes fixes for both the id and the DELETE..ORDER BY issues. I have had it running for like 5 minutes now (LOL).

ZmnSCPxj commented 3 years ago

Running for an hour now, seems okay.

ZmnSCPxj commented 3 years ago

Still up, so at least the bugfix did not introduce yet another bug.... The only remaining question is probably whether it actually fixes the bug.

ZmnSCPxj commented 3 years ago

Going offline. Bye!

ZmnSCPxj commented 3 years ago

I left my C-Lightning node running, but it got hit by https://github.com/ElementsProject/lightning/issues/4240 since I was using logrotate. Hmmm. Lemme run a stress-test.
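A stress-test of the kind described earlier in the thread (repeatedly calling lightning-cli getinfo) could be sketched as below. The helper `run_repeatedly` is hypothetical and not part of CLBOSS or Core Lightning. It only counts commands that exit with an error, so the lightningd log still has to be watched for the "Killing plugin" message.

```python
import subprocess


def run_repeatedly(command, iterations):
    """Run `command` `iterations` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(iterations):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,  # output is not needed, only the exit code
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures
```

On a machine with a running node, a call such as `run_repeatedly(["lightning-cli", "getinfo"], 100000)` would issue a large number of RPC calls. The iteration count is arbitrary and should be chosen so that the ids grow well past the point where the failure was first observed.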

hosiawak commented 3 years ago

I left it running (without using your modified version) and it got killed 2 days ago for the same reason:

2020-11-30T18:32:41.193Z INFO    plugin-clboss: FundsMover: Failed at source, cannot advance further.
2020-11-30T18:32:42.302Z INFO    plugin-clboss: Killing plugin: JSON-RPC response "id"-field is not a u64

I'm going to compile the version you posted above and let that run for a while. Thanks.

hosiawak commented 3 years ago

2 days later clboss is still running so I guess this issue is now fixed :)

ZmnSCPxj commented 3 years ago

Okay, let us close?