lightningd / plugins

Community curated plugins for core-lightning

Backup plugin is squirrely with SSHFS and SQM enabled via router #473

Closed v4w5gse65g874 closed 1 year ago

v4w5gse65g874 commented 1 year ago

Lightning version: v23.05.2
Plugins commit id: ce078bb

Issue: I was having bufferbloat problems on my network and enabled SQM via OpenWRT. Since then, my lightning node crashes after a few days, apparently because the backup plugin times out. The backup file is on an sshfs mount to a DigitalOcean droplet. I suspect that during a period of bloat the plugin's write attempt has to wait, and that the plugin is sensitive to the delay and times out immediately. The backup plugin then fails fatally, which crashes the node. Unfortunately, I updated my lightning node and the plugins repo at the same time I enabled SQM, so I'm not 100% confident the issue is due to SQM.

Logs:

Plugin '/root/repos/plugins/backup/backup.py' returned an invalid response to the db_write hook: {"jsonrpc": "2.0", "id": "cln:db_write#20075", "error": {"code": -32600, "message": "Error while processing db_write: [Errno 5] Input/output error: '/mnt/lightning_node_channel_backups/lightningd.sqlite3.bkp'", "traceback": "Traceback (most recent call last):\n File \"/root/repos/lightning/contrib/pyln-client/pyln/client/plugin.py\", line 639, in _dispatch_request\n result = self._exec_func(method.func, request)\n File \"/root/repos/lightning/contrib/pyln-client/pyln/client/plugin.py\", line 619, in _exec_func\n ret = func(*ba.args, **ba.kwargs)\n File \"/root/repos/plugins/backup/backup.py\", line 66, in on_db_write\n if plugin.backend.add_change(change):\n File \"/root/repos/plugins/backup/filebackend.py\", line 73, in add_change\n with open(self.url.path, 'ab') as f:\nOSError: [Errno 5] Input/output error: '/mnt/lightning_node_channel_backups/lightningd.sqlite3.bkp'\n"}}

2023-08-09T10:04:28.061Z BROKEN lightningd: Plugin '/root/repos/plugins/backup/backup.py' returned an invalid response to the db_write hook: {"jsonrpc": "2.0", "id": "cln:db_write#20075", "error": {"code": -32600, "message": "Error while processing db_write: [Errno 5] Input/output error: '/mnt/lightning_node_channel_backups/lightningd.sqlite3.bkp'", "traceback": "Traceback (most recent call last):\n File \"/root/repos/lightning/contrib/pyln-client/pyln/client/plugin.py\", line 639, in _dispatch_request\n result = self._exec_func(method.func, request)\n File \"/root/repos/lightning/contrib/pyln-client/pyln/client/plugin.py\", line 619, in _exec_func\n ret = func(*ba.args, **ba.kwargs)\n File \"/root/repos/plugins/backup/backup.py\", line 66, in on_db_write\n if plugin.backend.add_change(change):\n File \"/root/repos/plugins/backup/filebackend.py\", line 73, in add_change\n with open(self.url.path, 'ab') as f:\nOSError: [Errno 5] Input/output error: '/mnt/lightning_node_channel_backups/lightningd.sqlite3.bkp'\n"}}
lightningd: FATAL SIGNAL 6 (version v23.05.2)
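
For anyone reading the traceback: the failure is raised by the `open(..., 'ab')` call on the sshfs path, and errno 5 is `EIO`. If the error really is transient, one possible mitigation is to retry the open a few times before giving up. The following is only a rough sketch of that idea, not the plugin's actual code: `open_with_retry` and all of its parameters are hypothetical names, and the retry count and delays are arbitrary.

```python
import errno
import time


def open_with_retry(path, mode="ab", attempts=5, delay=1.0,
                    opener=open, sleep=time.sleep):
    """Open path, retrying with exponential backoff when open() fails with EIO.

    Hypothetical helper, not part of the backup plugin. `opener` and `sleep`
    are injectable only so the behaviour can be exercised without a real
    network mount.
    """
    for attempt in range(1, attempts + 1):
        try:
            return opener(path, mode)
        except OSError as exc:
            # Treat only EIO as possibly transient; any other error
            # (ENOENT, EACCES, ...) and the final failure are re-raised.
            if exc.errno != errno.EIO or attempt == attempts:
                raise
            sleep(delay * 2 ** (attempt - 1))  # waits 1s, 2s, 4s, ...
```

Usage would look like `with open_with_retry(path) as f: f.write(data)`. Only the open is retried, so a partially written change is never appended twice. The trade-off is that the plugin would not answer the `db_write` hook while it is waiting, so the node stalls for the duration of the retries instead of crashing.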

v4w5gse65g874 commented 1 year ago

I have created a workaround script, so I'm going to close this issue. I'll leave the script here in case someone else runs into the same problem. I realize there's a service script in the CLN repo, but my installation is unusual and I didn't want to hack the service script up.



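#!/bin/sh
# Watchdog: start lightningd whenever it is not running.
while true
do
        # pgrep -x matches the exact process name only
        if ! pgrep -x "lightningd" > /dev/null
        then
                /usr/local/bin/lightningd
        fi
        # Pause between checks so the loop does not spin the CPU
        # (the 5-second interval is arbitrary)
        sleep 5
done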