openziti / ziti

quickstart overwrites existing ziti-edge-router.yaml #1328

Closed mvelbaum closed 1 year ago

mvelbaum commented 1 year ago

I've been playing with the docker compose setup and it seems that the edge router yaml file gets overwritten when making changes to the yaml or .env file. I notice that ziti-edge-router-XXXXXX.init files are being created in the persistent folder.

This is quite annoying, as I've been manually editing the YAML file to add support for BrowZer and my changes get overwritten.

dovholuknf commented 1 year ago

Hi @mvelbaum, yeah that is 100% annoying... What version of the container are you running? That was a bug that made it a pain to modify the container after it started...

@gberl002, I thought we fixed this bug? You can change over to using the 'non-quickstart' (no-frills) container if you want. I can show you how to move/migrate to that if you want.

dovholuknf commented 1 year ago

looks like this is already fixed but won't be released until 0.30.4. I found the pr that it was fixed in: https://github.com/openziti/ziti/pull/1296/files

mvelbaum commented 1 year ago

Hey @dovholuknf, I'm using the latest image (ID bdff7d5d68e0) - seems like it was created 10 days ago. I would love to switch my current compose.yml to a non-quickstart setup :)

Edit: woops, looks like you've posted right before me :)

dovholuknf commented 1 year ago

No worries. I'll show you how to migrate from the quickstart container to the 'no frills, just run ziti router' container. gimme a few and i'll post back :)

dovholuknf commented 1 year ago

Assuming you followed the Docker (no compose) quickstart at https://openziti.io/docs/learn/quickstarts/network/local-with-docker, you would have made a persistent volume for your files and started the controller with:

docker run \
  --name ziti-controller \
  -e ZITI_CTRL_ADVERTISED_ADDRESS=ziti-edge-controller \
  --network myFirstZitiNetwork \
  --network-alias ziti-controller \
  --network-alias ziti-edge-controller \
  -p 1280:1280 \
  -it \
  --rm \
  -v myPersistentZitiFiles:/persistent \
  openziti/quickstart \
  /var/openziti/scripts/run-controller.sh

You could change that to use the "non quickstart" controller by running:

docker run \
  --name ziti-controller \
  --network myFirstZitiNetwork \
  --network-alias ziti-controller \
  --network-alias ziti-edge-controller \
  -p 1280:1280 \
  -it \
  --rm \
  -v myPersistentZitiFiles:/persistent \
  openziti/ziti-controller:0.30.3 \
  run /persistent/ziti-controller.yaml

Then you can do the same thing with the ziti-router. With the quickstart, you would have started the router with:

docker run \
  --name ziti-edge-router-1 \
  -e ZITI_ROUTER_NAME=ziti-edge-router-1 \
  -e ZITI_ROUTER_ADVERTISED_ADDRESS=ziti-edge-router-1 \
  -e ZITI_ROUTER_ROLES=public \
  --network myFirstZitiNetwork \
  --network-alias ziti-edge-router-1 \
  -p 3022:3022 \
  -it \
  --rm \
  -v myPersistentZitiFiles:/persistent \
  openziti/quickstart \
  /var/openziti/scripts/run-router.sh edge

Then you just flip that to be:

docker run \
  --name ziti-edge-router-1 \
  -e ZITI_ROUTER_NAME=ziti-edge-router-1 \
  -e ZITI_ROUTER_ADVERTISED_ADDRESS=ziti-edge-router-1 \
  -e ZITI_ROUTER_ROLES=public \
  --network myFirstZitiNetwork \
  --network-alias ziti-edge-router-1 \
  -p 3022:3022 \
  -it \
  --rm \
  -v myPersistentZitiFiles:/persistent \
  openziti/ziti-router:0.30.3 \
  run /persistent/ziti-edge-router-1.yaml

Then you can modify the files all you like and it won't do those "quickstarty" things and you won't need to wait for 0.30.4

dovholuknf commented 1 year ago

If you compare the commands, you'll see very few differences. They come down to the last two lines.

controller before:

  ...
  openziti/quickstart \
  /var/openziti/scripts/run-controller.sh

after:

  ...
  openziti/ziti-controller:0.30.3 \
  run /persistent/ziti-controller.yaml

router before:

  ...
  openziti/quickstart \
  /var/openziti/scripts/run-router.sh edge

after:

  ...
  openziti/ziti-router:0.30.3 \
  run /persistent/ziti-edge-router-1.yaml

dovholuknf commented 1 year ago

One final comment (sorry for the barrage): this ONLY works because the quickstart did all the hard work ahead of time, making a PKI, making the config files, etc. You can't just start from the 'no frills' container unless you do for yourself the sorts of things the quickstart "just does".... Hope that helps.
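
A minimal sketch of a pre-flight check before flipping images. It only confirms that the files the 'no frills' commands above point at already exist. The function name check_quickstart_output is hypothetical (it is not part of OpenZiti), the file names are taken from the commands in this thread and assume the router is named ziti-edge-router-1, and it assumes you run it somewhere the persistent volume is mounted, for example at /persistent inside a container.

```shell
# Hypothetical helper, not part of OpenZiti: check that the quickstart
# already produced the files the plain containers are told to run with.
check_quickstart_output() {
  dir="${1:-/persistent}"   # where the persistent volume is mounted
  missing=0
  # Config file names as used in the docker run commands above.
  for f in ziti-controller.yaml ziti-edge-router-1.yaml; do
    if [ ! -f "${dir}/${f}" ]; then
      echo "missing config: ${dir}/${f}"
      missing=1
    fi
  done
  # The quickstart-generated PKI lives under pki/ in the same volume.
  if [ ! -d "${dir}/pki" ]; then
    echo "missing PKI directory: ${dir}/pki"
    missing=1
  fi
  return "${missing}"
}
```

Something like `check_quickstart_output /persistent && echo "ok to switch images"` would then tell you whether the quickstart has already run against that volume. It does not validate the contents of the files.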

mvelbaum commented 1 year ago

@dovholuknf thanks a lot!

I'm very thankful to quickstart for doing all the hard work! :) I have been getting intimate with those scripts as I wanted to change the advertised address of the controller and had to go into the controller container, delete the existing pki dir, source /var/openziti/scripts/ziti-cli-functions.sh and run createPki ; createControllerConfig ; addRouter "${ZITI_ROUTER_NAME}" "public" "public"

I guess now that it's mostly functional I'm going to convert the setup to something more stable. And hopefully figure out why BrowZer doesn't work.

dovholuknf commented 1 year ago

Did you follow along with the browzer example doc? https://openziti.io/docs/learn/quickstarts/browzer/example/ There's a walkthrough video on that page too that hopefully will help?

We have a discourse over at https://openziti.discourse.group/. You'll often get a bit better engagement there. We get "a lot" of github notifications and those notifications come to us in a bit better way.

Did you find that page? I'm gonna close this issue out but if you need more help, you can reopen this issue or post on the discourse. Cheers

dovholuknf commented 1 year ago

Also possibly helpful/relevant, a short video i narrated describing what the quickstart does is there too https://openziti.discourse.group/t/what-does-the-quickstart-do-that-i-need-to-do-myself/1600/4

mvelbaum commented 1 year ago

> Did you follow along with the browzer example doc? https://openziti.io/docs/learn/quickstarts/browzer/example/ There's a walkthrough video on that page too that hopefully will help?

Yes, that was a great walkthrough (whoever wrote it is a very talented dude! ;)), but I hit a few snags. First, there was no arm64 image for the bootstrapper container, so I had to build it locally (no biggie). After finally launching it, I get this error on startup:

ziti-ziti-browzer-1  | {"timestamp": "2023-09-23T20:11:57.871Z", "level": "info", "message":  "ZITI_BROWZER_BOOTSTRAPPER_LOG_PATH is null"}
ziti-ziti-browzer-1  | {"level":"info","message":"ziti-browzer-bootstrapper initializing","timestamp":"2023-09-23T20:11:58.592Z","version":"0.38.0"}
ziti-ziti-browzer-1  | {"host":"ctrl.ziti.example.com","level":"info","message":"contacting specified controller","port":"8441","timestamp":"2023-09-23T20:11:58.601Z"}
ziti-ziti-browzer-1  | {"level":"debug","message":"configured target service(s)","targets":{"targetArray":[{"idp_client_id":"XXXXX","idp_issuer_base_url":"https://xxxxx.us.auth0.com","path":"/","scheme":"http","service":"homepage","vhost":"homepage.example.com"}]},"timestamp":"2023-09-23T20:11:58.692Z"}
ziti-ziti-browzer-1  | {"level":"info","message":"listening","port":"8446","scheme":"https","timestamp":"2023-09-23T20:11:58.719Z"}
ziti-ziti-browzer-1  | {"code":"SELF_SIGNED_CERT_IN_CHAIN","level":"error","message":"self signed certificate in certificate chain","stack":"Error: self signed certificate in certificate chain\n    at TLSSocket.onConnectSecure (node:_tls_wrap:1539:34)\n    at TLSSocket.emit (node:events:513:28)\n    at TLSSocket._finishInit (node:_tls_wrap:953:8)\n    at TLSWrap.ssl.onhandshakedone (node:_tls_wrap:734:12)","timestamp":"2023-09-23T20:11:58.744Z"}
ziti-ziti-browzer-1 exited with code 0

Not sure where the issue is, as I re-used the same letsencrypt certs I'm using for ZAC

ZITI_BROWZER_BOOTSTRAPPER_CERTIFICATE_PATH="/certs/fullchain.pem"
ZITI_BROWZER_BOOTSTRAPPER_KEY_PATH="/certs/key.pem"

> We have a discourse over at https://openziti.discourse.group/. You'll often get a bit better engagement there. We get "a lot" of github notifications and those notifications come to us in a bit better way.
>
> Did you find that page? I'm gonna close this issue out but if you need more help, you can reopen this issue or post on the discourse. Cheers

Yeah, I was just too lazy to create an account there :).

dovholuknf commented 1 year ago

> First, there was no arm64 image for the bootstrapper container,

Oh good to know. I'll let curt know and file an issue.

> Yeah, I was just too lazy to create an account there :).

Well lucky for you, you can just login with GitHub!!! :)

> {"code":"SELF_SIGNED_CERT_IN_CHAIN","level":"error","message":"self signed certificate in certificate chain"

Are you, by chance, trying to use browzer with a TLS-enabled app? As in, does your "targetArray" have a "scheme" that's https not http? Curt's been chasing a bug around that but I am not sure if that's the problem or not, but I think that's what you're seeing? Assuming that's the case, can you try using an app that's not TLS-enabled on the far side for now?

dovholuknf commented 1 year ago

and, i went to file the issue but i see you already filed one! thanks :)

mvelbaum commented 1 year ago

No, the service uses HTTP. I will continue debugging it tomorrow.

Btw, after moving from quickstart to the "normal" containers, I now keep getting this error in the logs:

ziti-ziti-controller-1   | {"_context":"tls:0.0.0.0:8441","error":"EOF","file":"github.com/openziti/transport/v2@v2.0.103/tls/listener.go:216","func":"github.com/openziti/transport/v2/tls.(*sharedListener).processConn","level":"error","msg":"handshake failed","time":"2023-09-23T22:06:32.853Z"}
ziti-ziti-controller-1   | {"_context":"tls:0.0.0.0:8441","error":"EOF","file":"github.com/openziti/transport/v2@v2.0.103/tls/listener.go:216","func":"github.com/openziti/transport/v2/tls.(*sharedListener).processConn","level":"error","msg":"handshake failed","time":"2023-09-23T22:07:34.076Z"}
ziti-ziti-controller-1   | {"_context":"tls:0.0.0.0:8441","error":"EOF","file":"github.com/openziti/transport/v2@v2.0.103/tls/listener.go:216","func":"github.com/openziti/transport/v2/tls.(*sharedListener).processConn","level":"error","msg":"handshake failed","time":"2023-09-23T22:08:35.237Z"}

Any idea what this may be?

dovholuknf commented 1 year ago

That appears to me to be a client of some kind, usually a ziti-edge-tunnel or Ziti Desktop Edge for Mac/Windows, which has an "old" or invalid PKI. I would ensure the certificate is valid for the address the client is trying to connect to.

This happens if the PKI is regenerated, or if the 'advertised address' does not match the one found in the certs presented. It happens to me a lot because I routinely destroy my overlay network, and forget to remove old identities in my ZDEW (ziti desktop edge for windows)
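
One way to see which certificate a listener actually presents for a given name is openssl s_client. A minimal sketch, assuming openssl is installed on the machine you test from. The function name show_served_cert is made up for illustration, and the host and port are whatever your client is configured to dial (8441 is just the port that shows up in the logs above).

```shell
# Hypothetical helper: print subject, issuer and validity dates of the
# certificate a TLS listener serves for the given host name.
show_served_cert() {
  host="$1"
  port="$2"
  # -servername sets the SNI name sent in the handshake.
  # </dev/null closes stdin so s_client exits once the handshake is done.
  openssl s_client -connect "${host}:${port}" -servername "${host}" </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -issuer -dates
}
```

For example, `show_served_cert ctrl.example.com 8441`. If the issuer printed is not the CA your client's identity was enrolled against, or the dates show a cert newer than the identity, that points at a regenerated PKI. If the handshake itself fails, the second openssl call will complain that it cannot load a certificate.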

mvelbaum commented 1 year ago

You are right, I restarted my home edge-router and it seemed to have fixed the issue. Not quite sure how to debug the BrowZer issue tbh, it seems that even setting ZITI_BROWZER_BOOTSTRAPPER_SCHEME to http (as well as the target's scheme) still gives that SELF_SIGNED_CERT_IN_CHAIN error :/.

dovholuknf commented 1 year ago

I've reached out to @rentallect, the gent developing browzer. We'll need to get his attention and eyes on it.

mvelbaum commented 1 year ago

@dovholuknf I think I figured it out! It appears that the controller's edge endpoint serves me the self-signed cert. Here's the identity section from ziti-controller.yaml:

    identity:
      ca:          "/persistent/pki/ziti-edge-controller-root-ca/certs/ziti-edge-controller-root-ca.cert"
      key:         "/persistent/pki/ziti-edge-controller-intermediate/keys/ctrl.example.com-server.key"
      server_cert: "/persistent/pki/ziti-edge-controller-intermediate/certs/ctrl.example.com-server.chain.pem"
      cert:        "/persistent/pki/ziti-edge-controller-intermediate/certs/ctrl.example.com-client.cert"
      alt_server_certs:
      - server_cert: "/certs/fullchain.pem"
        server_key:  "/certs/key.pem"

I think the issue here is that both the quickstart generated certs and my alt_server_certs have the same CN since the quickstart script used the edge advertised address. What do you suggest I change the config to?

P.S.: I noticed that /persistent/pki/ziti-edge-controller-intermediate has the following files:

-rw-r--r-- 1 ziggy zitiweb 2065 Sep 23 22:56 ctrl.example.com-client.cert
-rw-r--r-- 1 ziggy zitiweb 2175 Sep 23 22:56 ctrl.example.com-server.cert
-rw-r--r-- 1 ziggy zitiweb 6363 Sep 23 22:56 ctrl.example.com-server.chain.pem
-rw-r--r-- 2 ziggy zitiweb 2094 Sep 23 22:56 ziti-edge-controller-intermediate.cert
-rw-r--r-- 1 ziggy zitiweb 4188 Sep 23 22:56 ziti-edge-controller-intermediate.chain.pem

And /persistent/pki/ziti-edge-controller-intermediate/keys/:

-rw-r--r-- 1 ziggy zitiweb 3272 Sep 23 22:56 ctrl.example.com-client.key
-rw-r--r-- 1 ziggy zitiweb 3272 Sep 23 22:56 ctrl.example.com-server.key
-rw-r--r-- 2 ziggy zitiweb 3272 Sep 23 22:56 ziti-edge-controller-intermediate.key

I wonder if I should use the files named ziti-edge-controller-intermediate instead of the ones that have the conflicting CN?

P.P.S. That seemed to have solved the self-signed cert issue! I hope I did the right thing :)

dovholuknf commented 1 year ago

That is EXACTLY the problem, yes. I need to make this more obvious in that doc. I hit the same problem myself the first time but I probably forgot to make this crystal clear in the doc. If the same domain is used for the overlay network PKI and the "alt server" certs, SNI becomes nondeterministic as both sets of certs are for the same domain. That's why your router restart fixed the situation, but it's only because you were lucky.... I'll make a note to make that much more clear.

The only solution is to serve certs from a different domain so the SNI is not able to be confused.

Since you already have a functioning overlay network, I would make and use a different domain name for the alt certs. Or if you're ok with redoing the quickstart, you could generate a pki for the quickstart with a different name and keep the alt certs domain. (But it means you need to reset the entire overlay)

I think I'm going to file an issue to introspect the certs and ensure they don't have a domain collision since this is a hard issue to track down. Good on you for discovering it!

The other, more complex option would be to regenerate ONLY the server certs generated by the quickstart and ensure they don't overlap with the alt certs, i.e. that they don't contain a SAN entry that also appears in the alt certs...
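
The cert introspection idea could look roughly like this. It's a sketch only, not something ziti ships: the helper names cert_dns_names and common_names are made up, and the -ext subjectAltName option assumes OpenSSL 1.1.1 or newer. It lists the DNS SAN entries of two server certs and prints any name that appears in both.

```shell
# Hypothetical helper: print the DNS SAN entries of a PEM certificate,
# one per line, lowercased. openssl x509 reads only the first cert in a
# file, so for a chain file this inspects the first (usually leaf) cert.
cert_dns_names() {
  openssl x509 -in "$1" -noout -ext subjectAltName \
    | tr ',' '\n' \
    | sed -n 's/^ *DNS://p' \
    | tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: print every name from the newline-separated list
# in $1 that also appears, as an exact match, in the list in $2.
common_names() {
  printf '%s\n' "$1" | sort -u | while IFS= read -r name; do
    [ -n "${name}" ] || continue
    if printf '%s\n' "$2" | grep -Fxq -- "${name}"; then
      printf '%s\n' "${name}"
    fi
  done
}

# Example, using the paths from the identity section above:
# common_names \
#   "$(cert_dns_names /persistent/pki/ziti-edge-controller-intermediate/certs/ctrl.example.com-server.chain.pem)" \
#   "$(cert_dns_names /certs/fullchain.pem)"
```

Any output means the two certs claim the same DNS name. This compares names literally, so it won't notice a wildcard entry such as *.example.com covering a name in the other cert, and it doesn't look at the CN.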

Hope that helps