Start9Labs / cln-startos

wrapper for building c-lightning.s9pk

[bug]: Fresh Install stuck in `Waiting for system cert key file...` #130

Open storopoli opened 3 months ago

storopoli commented 3 months ago

Prerequisites

Device

Laptop/Desktop

Device OS

Other

Device OS Version

startos

Browser

Firefox

Browser Version

127

Current Behavior

Installed CLN from scratch and now it is stuck in

Waiting for system cert key file...

I did a search and found that I am hitting this line: https://github.com/Start9Labs/cln-startos/blob/29f0fe7924543e7689026ad0194cd3beb38a1048/docker_entrypoint.sh#L106-L111

I have no idea what's going on. My start9 /mnt dir does not have anything. Shouldn't CLN generate a system cert key file?

Cc @chrisguida since 48e6bbbf5d3c743f5c701567f6aa2b1dd9981e3d introduced these lines.

Expected Behavior

CLN should start from a fresh install in StartOS and not be stuck in an infinite loop.

Steps to Reproduce

  1. Install CLN from scratch
  2. Run and see the logs

Anything else?

(image attachment)

storopoli commented 3 months ago

Update: after fiddling around, I've discovered that the cert directory in StartOS is:

pub const PACKAGE_CERT_PATH: &str = "/var/lib/embassy/ssl"; (https://github.com/Start9Labs/start-os/blob/fc8b1193de618efe3c9fe9f68ed9c7ce23cd562f/core/startos/src/net/mod.rs#L23).

Then:

cp /embassy-data/package-data/volumes/c-lightning/data/main/bitcoin/server-key.pem /var/lib/embassy/ssl/c-lightning/rest/rest.key.pem
cp /embassy-data/package-data/volumes/c-lightning/data/main/bitcoin/server.pem /var/lib/embassy/ssl/c-lightning/rest/rest.cert.pem

seems to fix it. Maybe docker_entrypoint.sh should be made robust to that kind of scenario?
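One way the entrypoint could be hardened is to bound the wait instead of looping forever. The following is only a minimal sketch, not the code in docker_entrypoint.sh: the helper name `wait_for_file` and its default timeout are hypothetical, and the path in the usage note is an assumption based on the `/mnt/cert` directory and the `rest.key.pem` file name mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of docker_entrypoint.sh): wait for a file to
# appear, but give up after a deadline instead of looping forever.
# Usage: wait_for_file PATH [TIMEOUT_SECONDS]
wait_for_file() {
    path=$1
    timeout=${2:-60}   # assumed default; pick whatever suits the service
    waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            # Report on stderr so the failure is visible in the service logs.
            echo "Gave up after ${timeout}s: $path not found" >&2
            return 1
        fi
        echo "Waiting for $path..."
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A call such as `wait_for_file /mnt/cert/rest.key.pem 120 || exit 1` (path assumed) would then let the container fail with a clear message rather than print the waiting line indefinitely.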

chrisguida commented 3 months ago

@storopoli I've stopped maintaining this package, but I'm sure @Dominion5254 would accept a PR to fix :)

chrisguida commented 3 months ago

FTR this is probably an edge case on your particular hardware, I haven't ever seen this on any of my installs. But yeah I'm sure my code is not watertight :)

Dominion5254 commented 3 months ago

I don't believe I have seen this before. Is there anything unique about your setup that might give us some clues as to why /mnt/cert is empty? StartOS version? CLN version? Hardware running StartOS?

It is also worth mentioning that c-lightning-REST is slated to be deprecated in the near future in favor of clnrest, cln-grpc, commando, etc. So it might not make sense to find and patch a seemingly remote edge case for a soon-to-be-deprecated connection interface.

storopoli commented 3 months ago

Start9 Server One 2023 with Celeron N4505. Version 0.3.5~1

This started happening after I chrooted and created some systemd socat services following https://community.start9.com/t/core-lightning-with-tor-and-ipv4-clearnet/965

I've also added the following lines to config.main:

announce-addr=<CLEARNET_DNS_DOMAIN>:9735
announce-addr-dns=true

chrisguida commented 3 months ago

Ahh yeah that's definitely a hack. I can't immediately think of a reason that would break the container, but doing stuff like that definitely "voids your warranty" :p

Have you tried just turning that off and seeing if it works? The container is not expecting the config.main file to be changed by the user.

storopoli commented 3 months ago

Yes, I tried restoring from a backup and wiping the /embassy-data/.../c-lightning folder. I also tried from a brand new node. Unfortunately I was forced to move to LND after 1 solid year on CLN. I will probably wait for 0.3.6, which will arrive in $[14, \infty)$ days.

Feel free to close this if you cannot reproduce it.

chrisguida commented 3 months ago

It's doing the same thing from a fresh node with nothing in the datadir? That doesn't make any sense...

storopoli commented 3 months ago

Yes, because it does not have the /mnt/cert/ folder somehow
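To tell a cert volume that was never mounted apart from one that is mounted but empty, one could check the mount table from inside the container. This is a rough sketch under assumptions: `is_mount_point` is a hypothetical helper, and it assumes a Linux-style `/proc/mounts` where the second field of each line is the mount point.

```shell
#!/bin/sh
# Hypothetical helper: succeed if DIR is listed as a mount point in a mounts
# table (defaults to /proc/mounts on Linux).
# Usage: is_mount_point DIR [MOUNTS_FILE]
is_mount_point() {
    dir=${1%/}                  # drop one trailing slash, e.g. /mnt/cert/
    [ -n "$dir" ] || dir=/      # keep the root directory as "/"
    mounts=${2:-/proc/mounts}
    # Field 2 of each line is the mount point; exit 0 only on an exact match.
    awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' "$mounts"
}
```

For example, `is_mount_point /mnt/cert || echo "cert volume is not mounted" >&2` would point toward the OS not mounting the volume rather than toward the package's own files.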

Dominion5254 commented 3 months ago

The direct cause isn't immediately apparent to me either, but the clearnet hack almost certainly seems to be the culprit for the original issue. While it is great to see users doing cool DIY hacks, going under the hood to make changes such as this or resolving resulting issues is of course not something Start9 can officially support.

But I am confused by why this would occur with "a brand new node" - by this do you mean re-installing CLN on the same box that you had chrooted, or a completely fresh install of StartOS?

chrisguida commented 3 months ago

I mean this is an OS issue if the cert volume isn't getting mounted on a fresh new install of CLN on a brand new start-os server.

Dominion5254 commented 3 months ago

Agreed, if that is the case it warrants opening an issue on the StartOS repo, but I want to make sure that is in fact what @storopoli meant by "a brand new node".