jamesog / tailscale-edgeos

Running Tailscale on Ubiquiti EdgeOS
MIT License

"No Such File Or Directory" On Step 2 of "Installing Tailscale" section of readme #34

Closed JT-theDM closed 8 months ago

JT-theDM commented 9 months ago

Not sure if this was just my issue, but I wasn't able to install Tailscale following your instructions. I believe I followed the instructions accurately up through running the first script at /config/scripts/firstboot.d/tailscale.sh. When I tried to run the next script, I got a "no such file or directory" error.

Router: EdgeRouter X
OS: EdgeOS v2.0.9-hotfix.7
Notes: This is the first thing I did after a factory reset plus config via the basic setup wizard. Connected over SSH from a Windows 11 command line.

Steps to replicate:

  1. Follow steps 1 and 2 in the readme
  2. Hit Enter on the last line from step 2: /config/scripts/post-config.d/tailscale.sh

I was able to get this installed after some troubleshooting. I think the cause for the failure has to do with the certificate fix added in the last update to the script.

I started troubleshooting by checking whether the /config/tailscale/ path had been created. That is one of the first steps after the certs fix and the easiest way to verify that any part of the script had succeeded. The path had not been created, even though running /config/scripts/firstboot.d/tailscale.sh gave no error.

Next I ran the certificate fix commands outside of the script to see if I got an error. Everything worked as it should, even though the lines were identical to the ones in the script.

Then I edited /config/scripts/firstboot.d/tailscale.sh to comment out the certificate fix section. The only other edit I made to the file was changing "set -e" to "set -x". (I've got no real experience with shell, and based on some quick and lazy googling I was hoping that would show more of the script's output so I could look for errors.)

The start of the file after the edit was:

#!/bin/sh

set -x

#sed -i 's|^mozilla\/DST_Root_CA_X3\.crt|!mozilla/DST_Root_CA_X3.crt|' /etc/ca-certificates.conf
#curl -sk https://letsencrypt.org/certs/isrgrootx1.pem -o /usr/local/share/ca-certificates/ISRG_Root_X1.crt
#update-ca-certificates --fresh

The rest of the file was unchanged.

After those edits the script worked the first time I ran it, so unless nobody else can replicate this, I think there may be a problem with the new section of the tailscale script.

Please let me know if anyone is able to recreate this issue. I'm attaching the SSH session in case anyone wants to check it for an embarrassing mistake on my part.

image

jamesog commented 9 months ago

Hey @JT-theDM, thanks for the incredibly detailed report! That's all very helpful.

It does sound like the changes for the certificate fix are causing set -e (which tells the shell to exit when any command fails) to exit the script early. An improvement here would be to add a trap to the script so it can output the error. It's odd that none of the commands are outputting an error of their own, though.
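A minimal sketch of the trap idea, not the change that ended up in the repo. POSIX sh has no ERR trap, so this assumes an EXIT trap that reports a non-zero status. The demo script, its messages, and the `false` stand-in for the failing command are all hypothetical.

```shell
#!/bin/sh
# Write a tiny stand-in for the install script to a temporary file.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/sh
# On exit, report a non-zero status. POSIX sh has no ERR trap, so EXIT is used.
trap 'status=$?; [ "$status" -eq 0 ] || echo "demo: aborted with exit status $status" >&2' EXIT
set -e

echo "step 1 ok"
false               # stand-in for a command that fails without printing anything
echo "step 2 ok"    # not reached, because set -e stops the script at the failure
EOF

# Run the demo in its own shell and capture its output and exit status.
status=0
output=$(sh "$demo" 2>&1) || status=$?
rm -f "$demo"
printf '%s\n' "$output"
```

With a trap like this, a script that stops early at least says that it stopped and with which status, even when the failing command itself stays silent.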

@haozhun would you mind taking a look at this as you said you tested the changes? Thanks!

haozhun commented 9 months ago

In my original PR, I mentioned that I've verified that the fix is idempotent.

However, in reality, I did actually run into one issue when I repeatedly ran the steps. I couldn't reproduce it, so I chalked it up to me accidentally doing something stupid. However, it looks like that is not the case, considering that this report matched exactly the error that I ran into.

The issue is that the /usr/local/share/ca-certificates/ directory somehow does not exist. As a result, curl fails.

> It's odd that none of the commands are outputting an error of their own, though.

Indeed. Your comment reminded me to take a look at the curl options. Normally, one would use -sS in a script, so that curl is mostly silent if everything goes smoothly but still prints errors if something bad happens. In this case, I copy-pasted the command from https://community.ui.com/questions/Fix-Solution-Lets-Encrypt-DST-Root-CA-X3-Expiration-Problems-with-IDS-IPS-Signature-Updates-HTTPS-E/0404a626-1a77-4d6c-9b4c-17ea3dea641d, and I didn't think about it. There, curl is run with -s, so it's completely silent, which is a bad idea in general, and especially so in a script.
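To illustrate the difference between -s and -sS, here is a small sketch that reproduces the missing-directory failure offline. It assumes curl is installed with file:// support. The temporary paths and the dummy file are made up for the demo and stand in for the real HTTPS download.

```shell
#!/bin/sh
# A file:// URL keeps the demo offline; the real script downloads over HTTPS.
workdir=$(mktemp -d)
echo "dummy certificate" > "$workdir/source.pem"
src="file://$workdir/source.pem"
dest="$workdir/missing-dir/ISRG_Root_X1.crt"   # parent directory does not exist

# -s: silent, including error messages.
silent_status=0
silent_err=$(curl -s "$src" -o "$dest" 2>&1) || silent_status=$?

# -sS: still quiet on success, but error messages are shown.
verbose_status=0
verbose_err=$(curl -sS "$src" -o "$dest" 2>&1) || verbose_status=$?

echo "-s  exit status: $silent_status, message: '$silent_err'"
echo "-sS exit status: $verbose_status, message: '$verbose_err'"
rm -rf "$workdir"
```

Both runs should fail with a non-zero status, but only the -sS run should explain why, which is the information that was missing from the original report.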

As I set out to change -s to -sS, I realized that the command is run with -k (short for --insecure). This bypasses HTTPS certificate validation. That is a terrible, terrible thing to put in this repo. It is understandable why it's there in the first place: the letsencrypt.org domain uses a TLS cert that is itself ultimately signed by ISRG_Root_X1, so we need to break the circular dependency somehow, and adding -k is the simplest solution. Nevertheless, it is terrible.

Therefore, I request that you revert my previous commit.

I can think of two paths forward here:

  1. (This is preferred if possible.) Is there a trusted 3rd party that hosts this pem file? If so, we should use that.
  2. We can include the cert file in this repo. That means that all users are trusting you not to put some rogue cert file in here (so that you can MITM them). But anyone running your script is probably trusting you already. And anyone who cares to read the script before running it can also check the file's authenticity themselves.

jamesog commented 9 months ago

Thank you very much for this detailed analysis!

You're right: -k is the simplest, but it's also terrible.

I think it would be acceptable to have a copy of the ISRG X1 root certificate in this repo and install it that way. It's a public certificate so it's easily verifiable.
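A rough sketch of what installing a bundled copy could look like, under some assumptions: the function name, its arguments, and the checksum step are hypothetical, and it assumes `sha256sum` is available on the router. The expected checksum is passed in by the caller rather than hard-coded here.

```shell
#!/bin/sh
# install_bundled_cert SRC DEST_DIR EXPECTED_SHA256
# Copy a certificate shipped alongside the script into DEST_DIR, but only
# if its SHA-256 checksum matches the value the caller expects.
install_bundled_cert() {
    src=$1
    dest_dir=$2
    expected_sha256=$3

    actual_sha256=$(sha256sum "$src" | cut -d' ' -f1)
    if [ "$actual_sha256" != "$expected_sha256" ]; then
        echo "checksum mismatch for $src" >&2
        return 1
    fi

    # Create the target directory first; a missing directory is what made
    # the original curl step fail.
    mkdir -p "$dest_dir" && cp "$src" "$dest_dir/ISRG_Root_X1.crt"
}
```

In the real script the destination would presumably be /usr/local/share/ca-certificates, followed by the existing `update-ca-certificates --fresh` step.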

Pengozoid commented 9 months ago

Tried to install today on an EdgeRouter 4 with EdgeOS v2.0.9-hotfix.7.

Changed the curl command to this:

curl https://letsencrypt.org/certs/isrgrootx1.pem --create-dirs -o /usr/local/share/ca-certificates/ISRG_Root_X1.crt

Installed successfully.
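For reference, the same idea wrapped in a small helper. This is only a sketch: `fetch_to` is a hypothetical name, and the added -f and -sS flags are a suggestion layered on top of the command above, not something that was tested on an EdgeRouter in this thread.

```shell
#!/bin/sh
# fetch_to URL DEST: download URL to DEST, creating DEST's parent directories.
fetch_to() {
    # -f: treat HTTP error responses as failures
    # -sS: no progress meter, but error messages are still printed
    # --create-dirs: create any missing directories in the -o path
    curl -fsS --create-dirs "$1" -o "$2"
}
```

It would be called as `fetch_to https://letsencrypt.org/certs/isrgrootx1.pem /usr/local/share/ca-certificates/ISRG_Root_X1.crt`. Note that this still validates the server certificate, so it only works where the system already trusts the chain.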

johantiden commented 9 months ago

I removed set -e from the script and it continued successfully. curl seems to work fine but returns non-zero, so the rest of the execution is stopped.

Why doesn't set -e (or set -o errexit, or trap ERR) do what I expected?

jamesog commented 9 months ago

If anyone still has a fresh enough system to exhibit the problem, can you help me verify whether the curl is even needed?

On my ER4 it seems like the ISRG Root X1 cert is already present:

root@gw02:~# ls -l /usr/share/ca-certificates/mozilla/ISRG_Root_X1.crt
-rw-r--r--    1 root     root          1939 Jun  5  2020 /usr/share/ca-certificates/mozilla/ISRG_Root_X1.crt

And, as far as I can tell on this system (which is not a clean system), the sed and update-ca-certificates commands are enough, without fetching the newer X1 cert. On this system, the cert in /usr/share is identical to the one downloaded with curl.

If we can get away with removing the curl command from the setup script that will fix this.
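One way to run that check on a fresh system, as a sketch only: `check_bundled_cert` is a hypothetical helper that reports whether the system copy exists and, optionally, whether it is byte-identical to a downloaded copy.

```shell
#!/bin/sh
# check_bundled_cert SYSTEM_CERT [DOWNLOADED_CERT]
# Succeeds if SYSTEM_CERT exists and is non-empty and, when a second file
# is given, if the two files are byte-for-byte identical.
check_bundled_cert() {
    system_cert=$1
    downloaded_cert=${2:-}

    if [ ! -s "$system_cert" ]; then
        echo "missing: $system_cert"
        return 1
    fi
    if [ -n "$downloaded_cert" ] && ! cmp -s "$system_cert" "$downloaded_cert"; then
        echo "differs: $system_cert vs $downloaded_cert"
        return 1
    fi
    echo "present: $system_cert"
}
```

For example, `check_bundled_cert /usr/share/ca-certificates/mozilla/ISRG_Root_X1.crt` covers the presence check, and passing a freshly downloaded copy as the second argument covers the comparison.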

johantiden commented 9 months ago

I have that file as well

Linux EdgeRouter-4 4.9.79-UBNT #1 SMP Thu Jun 15 11:34:36 UTC 2023 mips64
Welcome to EdgeOS
Last login: Tue Jan 23 13:48:02 2024 from 192.168.1.45

admin@EdgeRouter-4:~$ ls -l /usr/share/ca-certificates/mozilla/ISRG_Root_X1.crt
-rw-r--r--    1 root     root          1939 Jun  5  2020 /usr/share/ca-certificates/mozilla/ISRG_Root_X1.crt

Notice also that I don't have the file specified in the curl download:

admin@EdgeRouter-4:~$ ls /usr/local/share/ca-certificates/ISRG_Root_X1.crt
ls: /usr/local/share/ca-certificates/ISRG_Root_X1.crt: No such file or directory

I guess curl failed because the folder /usr/local/share/ca-certificates doesn't exist.

jorgen-k commented 8 months ago

Removed the curl completely and it worked fine. EdgeRouter X, v2.0.9-hotfix.7.

jamesog commented 8 months ago

Thanks for that confirmation. I've committed a change that removes the curl.