jamaljsr / polar

One-click Bitcoin Lightning networks for local app development & testing
https://lightningpolar.com
MIT License

Bug: Unable to run CLN on macOS Ventura (13.5.2) #779

Open sr-gi opened 9 months ago

sr-gi commented 9 months ago

Trying to run any network that includes a CLN node (with any of the versions available in Polar 2.0.0) results in the CLN node not bootstrapping.

Checking the logs of any of the CLN nodes, the same error pops up:

2023-09-15 14:46:51 (node:48) UnhandledPromiseRejectionWarning: Error: EINVAL: invalid argument, stat '/home/clightning/.lightning/regtest/lightning-rpc'
2023-09-15 14:46:51     at Object.statSync (fs.js:1132:3)
2023-09-15 14:46:51     at fStat (/opt/c-lightning-rest/lightning-client-js.js:15:28)
2023-09-15 14:46:51     at new LightningClient (/opt/c-lightning-rest/lightning-client-js.js:28:35)
2023-09-15 14:46:51     at module.exports (/opt/c-lightning-rest/lightning-client-js.js:171:29)
2023-09-15 14:46:51     at Object.<anonymous> (/opt/c-lightning-rest/app.js:47:45)
2023-09-15 14:46:51     at Module._compile (internal/modules/cjs/loader.js:1114:14)
2023-09-15 14:46:51     at Object.Module._extensions..js (internal/modules/cjs/loader.js:1143:10)
2023-09-15 14:46:51     at Module.load (internal/modules/cjs/loader.js:979:32)
2023-09-15 14:46:51     at Function.Module._load (internal/modules/cjs/loader.js:819:12)
2023-09-15 14:46:51     at Module.require (internal/modules/cjs/loader.js:1003:19)
2023-09-15 14:46:51 (Use `node --trace-warnings ...` to show where the warning was created)
2023-09-15 14:46:51 (node:48) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
2023-09-15 14:46:51 (node:48) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

This seems to be related to c-lightning-rest, but that's as far as I can go.

As a side note, I updated to Ventura (13.5.2) yesterday, and Polar was working fine with CLN nodes before that.

sr-gi commented 8 months ago

I've been digging into this to figure out what is going on. Judging by the node error, the following command is failing:

stat '/home/clightning/.lightning/regtest/lightning-rpc'

That seemed odd to me (why would stat fail?), so I exec'd into the container and ran stat manually, which yielded:

stat: cannot statx '/home/clightning/.lightning/regtest/lightning-rpc': Invalid argument

I checked that the file was there, and it indeed was, but something is odd with the file:

> root@bob:/home/clightning/.lightning/regtest# ls -la
ls: cannot access 'lightning-rpc': Invalid argument
total 308
drwx------ 14 clightning clightning    448 Oct 16 18:04 .
drwxr-xr-x  4 clightning clightning    128 Oct 16 17:30 ..
-rw-r--r--  1 clightning clightning  36864 Oct 16 18:04 accounts.sqlite3
-rw-r--r--  1 clightning clightning    246 Oct 16 17:30 ca-key.pem
-rw-r--r--  1 clightning clightning    568 Oct 16 17:30 ca.pem
-rw-r--r--  1 clightning clightning    246 Oct 16 17:30 client-key.pem
-rw-r--r--  1 clightning clightning    510 Oct 16 17:30 client.pem
-r--------  1 root       root           57 Oct 16 17:30 emergency.recover
-rw-------  1 root       root            1 Oct 16 18:04 gossip_store
-r--------  1 clightning clightning     32 Oct 16 17:30 hsm_secret
s?????????  ? ?          ?               ?            ? lightning-rpc
-rw-r--r--  1 clightning clightning 241664 Oct 16 18:04 lightningd.sqlite3
-rw-r--r--  1 clightning clightning    246 Oct 16 17:30 server-key.pem
-rw-r--r--  1 clightning clightning    510 Oct 16 17:30 server.pem
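
A minimal sketch of how one might classify the socket path from inside the container, in the same way the failing stat call sees it. The function name socket_status is hypothetical (it is not part of Polar or CLN); it relies only on the standard shell file tests.

```shell
#!/bin/sh
# Hypothetical helper (not part of Polar or CLN): classify a path the way
# a stat-based check would see it.
socket_status() {
  path="$1"
  if [ -S "$path" ]; then
    # stat succeeded and the file is a unix socket
    echo "socket"
  elif [ -e "$path" ] || [ -L "$path" ]; then
    # stat succeeded but the file is something else
    echo "not-a-socket"
  else
    # Covers both a missing file and one whose metadata cannot be read,
    # such as the "Invalid argument" case in the listing above.
    echo "missing-or-unreadable"
  fi
}

# Example call inside the container:
#   socket_status /home/clightning/.lightning/regtest/lightning-rpc
```

On a healthy node the call should report a socket; with the listing above, where the metadata cannot be read, it would presumably fall through to the last branch.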

So I figured it might be worth running without c-lightning-rest to see whether the issue persisted. It did. Checking the Docker logs after disabling the plugin, though, I noticed errors when changing the ownership of some of the files within the .lightning folder, including lightning-rpc:

2023-10-16 14:04:46 chown: changing ownership of '/home/clightning/.lightning/regtest/emergency.recover': Permission denied
2023-10-16 14:04:46 chown: changing ownership of '/home/clightning/.lightning/regtest/lightning-rpc': Invalid argument
2023-10-16 14:04:46 chown: changing ownership of '/home/clightning/.lightning/regtest/hsm_secret': Permission denied

This, to my understanding, is part of the entrypoint script to CLN, but that's as far as I've been able to get. Not sure what is causing the issue :/

jamaljsr commented 8 months ago

Thank you for all of the troubleshooting details you've provided. I'm not sure exactly what changed that would cause this to break. I suspect it's either macOS or Docker, but it's hard to tell for sure. I'm not sure when I'll have the bandwidth to troubleshoot this myself. If you gather any more info on this, please do share, as it'll be helpful in getting to the bottom of this.

oneforalone commented 8 months ago

After pointing the rpc socket file at another path inside the container, it works. There seems to be an issue between socket files and Docker volumes, but I have no idea how to trace its root cause. Here is a temporary solution:

  1. Create a network and DO NOT start it.
  2. Open a terminal, change to ~/.polar/networks, and find the directory for your network. If you have only one network, there is just one directory, named 1:
    cd ~/.polar/networks/1

    If you have multiple networks, replace the last number with the highest one.

  3. Edit the docker-compose.yml file, locate the clightning configuration, and append --rpc-file=../../lightning-rpc to the end of the command field:
    alice:
      environment:
        USERID: ${USERID:-1000}
        GROUPID: ${GROUPID:-1000}
      stop_grace_period: 2m
      image: polarlightning/clightning:23.05.2
      container_name: polar-n1-alice
      hostname: alice
      command: >-
        lightningd --alias=alice --addr=alice --addr=0.0.0.0:9735
        --network=regtest --bitcoin-rpcuser=polaruser
        --bitcoin-rpcpassword=polarpass --bitcoin-rpcconnect=polar-n1-backend1
        --bitcoin-rpcport=18443 --log-level=debug --dev-bitcoind-poll=2
        --dev-fast-gossip --grpc-port=11001 --log-file=-
        --log-file=/home/clightning/.lightning/debug.log
        --plugin=/opt/c-lightning-rest/plugin.js --rest-port=8080
        --rest-protocol=http --rpc-file=../../lightning-rpc
  4. Start the network.
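
Steps 2 and 3 can be sketched as two small shell helpers. This is only an illustration: the function names latest_network_dir and has_rpc_file_flag are hypothetical, and it assumes the network directories under ~/.polar/networks are named with plain numbers, as described above.

```shell
#!/bin/sh
# Hypothetical helper: print the highest-numbered directory under a Polar
# networks folder (defaults to ~/.polar/networks).
latest_network_dir() {
  base="${1:-$HOME/.polar/networks}"
  # Keep only purely numeric names, sort them numerically, take the last one.
  latest=$(ls "$base" 2>/dev/null | grep -E '^[0-9]+$' | sort -n | tail -n 1)
  [ -n "$latest" ] && printf '%s/%s\n' "$base" "$latest"
}

# Hypothetical helper: succeed if the compose file already passes
# --rpc-file to lightningd, so the flag is not appended twice.
has_rpc_file_flag() {
  grep -q -- '--rpc-file=' "$1"
}

# Example usage:
#   dir=$(latest_network_dir) &&
#     has_rpc_file_flag "$dir/docker-compose.yml" || echo "flag missing in $dir"
```

The helpers only locate and inspect the file; the edit itself is still done by hand, since rewriting the YAML command field with a one-liner is easy to get wrong.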

Maybe, on the Polar side, we can change the CLN configuration templates to avoid this issue.

sr-gi commented 8 months ago

Nice workaround!

You can also do this from the GUI by editing the configuration of the CLN node and modifying the rpc-file in the same way. That is: Select node -> Actions -> Edit options -> Click on Pre-fill with default command -> Add the rpc-file manually

jamaljsr commented 8 months ago

Thanks a bunch for sharing the workarounds. It looks like we can no longer mount the unix socket via the Docker volumes. This seems like an easy fix to implement directly in the codebase. I'll work on this when time permits.

sr-gi commented 8 months ago

To add to the workarounds, it may be worth also defining a symlink at the location where lightning-cli expects to find your rpc-file. Otherwise CLI calls will need to append the rpc-file param, which is a bit annoying. To do so, run (as the clightning user):

ln -s /home/clightning/lightning-rpc /home/clightning/.lightning/regtest/lightning-rpc

jamaljsr commented 8 months ago

My initial thought was to just update the alias command that's automatically run when the Terminal is opened.

alias lightning-cli="lightning-cli --network regtest --rpc-file=/home/clightning/lightning-rpc"

sr-gi commented 8 months ago

> My initial thought was to just update the alias command that's automatically run when the Terminal is opened.
>
> alias lightning-cli="lightning-cli --network regtest --rpc-file=/home/clightning/lightning-rpc"

That works, but it may break things outside the container that expect to find the rpc-file in the default location based on the network.

I'm not saying that is a good practice, but I bet some projects do rely on it.

oneforalone commented 8 months ago

> You can also do this from the GUI by editing the configuration of the CLN node and modifying the rpc-file in the same way. That is: Select node -> Actions -> Edit options -> Click on Pre-fill with default command -> Add the rpc-file manually

That's cool. I'm new to Polar and hadn't noticed this feature. It's better than editing the docker-compose.yml file.

One thing I forgot to mention is that you cannot give --rpc-file an absolute path; a relative path is required when launching lightningd, which means it must start with ../../ to get out of the Docker-mounted volume. This may be an issue in CLN, where the --rpc-file option does not handle the argument properly.
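
To see where ../../lightning-rpc ends up, here is a small sketch that resolves a relative --rpc-file value against a base directory. The function name resolve_rpc_file is hypothetical, and the sketch assumes lightningd resolves the value relative to its network directory (/home/clightning/.lightning/regtest in this setup), which is what the paths in this thread suggest.

```shell
#!/bin/sh
# Hypothetical helper: resolve a relative --rpc-file value against the
# directory lightningd is assumed to run in (<lightning-dir>/<network>).
resolve_rpc_file() {
  base="$1"   # e.g. /home/clightning/.lightning/regtest
  rel="$2"    # e.g. ../../lightning-rpc
  (
    # Run in a subshell so the caller's working directory is untouched.
    cd "$base" && cd "$(dirname "$rel")" &&
      printf '%s/%s\n' "$(pwd -P)" "$(basename "$rel")"
  )
}

# Example call inside the container:
#   resolve_rpc_file /home/clightning/.lightning/regtest ../../lightning-rpc
```

Under that assumption, the example call points at /home/clightning/lightning-rpc, two levels up from the network directory and therefore outside the mounted .lightning volume.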

johngribbin commented 7 months ago

> That works, but it may break things outside the container that expect to find the rpc-file in the default location based on the network.
>
> I'm not saying that is a good practice, but I bet some projects do rely on it.

Yeah, I think this is happening for me. I am trying to run Polar alongside sim-ln (to generate thousands of transactions on nodes for testing).

I'm getting a "No such file or directory" error when I try to start up sim-ln. This is the same error I see when I try to use the Terminal in Polar to run CLI commands.

sr-gi commented 4 months ago

Just FYI, the patch to fix the default rpc-file on cln-grpc was recently merged; however, they haven't released a new version of the crate yet.