tinkerbell / playground

Example deployments of the Tinkerbell Stack for use as playground environments
Apache License 2.0

x509 cert error on docker-compose #127

Closed chechuironman closed 2 years ago

chechuironman commented 2 years ago

Hey,

I'm running the docker-compose quick start and I'm getting this error:

{"level":"info","ts":1648207945.5831676,"caller":"boots/dhcp.go:91","msg":"retrieved job is empty","service":"github.com/tinkerbell/boots","pkg":"main","type":"DHCPDISCOVER","mac":"ac:1f:6b:c7:ba:da","err":"discover from dhcp message: get hardware by mac from tink: rpc error: code = Unavailable desc = connection error: desc = \"transport: authentication handshake failed: x509: certificate is valid for 192.168.56.4, 127.0.0.1, not 10.126.118.60\"","errVerbose":"rpc error: code = Unavailable desc = connection error: desc = \"transport: authentication handshake failed: x509: certificate is valid for 192.168.56.4, 127.0.0.1, not 10.126.118.60\"\nget hardware by mac from tink\ngithub.com/tinkerbell/boots/packet.(client).DiscoverHardwareFromDHCP\n\t/opt/actions-runner/_work/boots/boots/packet/endpoints.go:108\ngithub.com/tinkerbell/boots/job.discoverHardwareFromDHCP.func1\n\t/opt/actions-runner/_work/boots/boots/job/fetch.go:17\ngithub.com/golang/groupcache/singleflight.(Group).Do\n\t/home/github/go/pkg/mod/github.com/golang/groupcache@v0.0.0-20190702054246-869f871628b6/singleflight/singleflight.go:56\ngithub.com/tinkerbell/boots/job.discoverHardwareFromDHCP\n\t/opt/actions-runner/_work/boots/boots/job/fetch.go:19\ngithub.com/tinkerbell/boots/job.CreateFromDHCP\n\t/opt/actions-runner/_work/boots/boots/job/job.go:106\nmain.dhcpHandler.serveDHCP\n\t/opt/actions-runner/_work/boots/boots/cmd/boots/dhcp.go:89\nmain.dhcpHandler.ServeDHCP.func1\n\t/opt/actions-runner/_work/boots/boots/cmd/boots/dhcp.go:50\ngithub.com/gammazero/workerpool.startWorker\n\t/home/github/go/pkg/mod/github.com/gammazero/workerpool@v0.0.0-20200311205957-7b00833861c6/workerpool.go:218\nruntime.goexit\n\t/opt/actions-runner/_work/_tool/go/1.16.3/x64/src/runtime/asm_amd64.s:1371\ndiscover from dhcp message"}

The VM where I'm running docker-compose has two interfaces (public and private). The bare-metal machines are on the private network, where this VM holds the IP 10.126.118.60. Any idea why I'm getting this cert error?
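
For readers skimming the log: the x509 message says the server certificate lists 192.168.56.4 and 127.0.0.1, while the client connected to 10.126.118.60, an address the certificate does not cover. A minimal sketch for pulling those two pieces out of such a message; the function name `explain_x509_mismatch` is hypothetical and not part of any Tinkerbell tooling:

```shell
#!/bin/sh
# Hypothetical helper: summarize a Go x509 address-mismatch message.
# Prints the addresses the certificate covers and the address requested.
# Prints nothing when the input does not contain such a message.
explain_x509_mismatch() {
  printf '%s\n' "$1" | sed -n \
    's/.*certificate is valid for \(.*\), not \([0-9a-fA-F.:]*\).*/covers: \1; requested: \2/p'
}
```

Calling it as `explain_x509_mismatch 'x509: certificate is valid for 192.168.56.4, 127.0.0.1, not 10.126.118.60'` shows the covered and requested addresses side by side, which makes it easier to see which IP the certificate would need to include.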

jacobweinstock commented 2 years ago

I'm seeing that same thing. Looks like maybe the generate-tls-certs container in the docker-compose is not working correctly.

ylxxwx commented 2 years ago

I still see the same issue with this change.

jacobweinstock commented 2 years ago

> I still see the same issue with this change.

Did you delete the certs docker volume? Might need to do that. Would you mind posting the logs from the generate tls container?
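
A rough sketch of how that information could be gathered. The service name `generate-tls-certs` is taken from the comment above and the `certs` name filter is a guess; both are assumptions that may not match the actual compose file. With `DRY_RUN=1` the commands are only printed, not executed.

```shell
#!/bin/sh
# Print each command; execute it only when DRY_RUN is not set to 1.
run() {
  echo "+ $*"
  [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

collect_cert_debug_info() {
  # Logs of the certificate-generation service (service name assumed).
  run docker-compose logs generate-tls-certs
  # Volumes whose name contains "certs" (name pattern assumed).
  run docker volume ls --filter name=certs
}
```

Run `collect_cert_debug_info` from the directory that contains the compose file, since `docker-compose logs` resolves services relative to it.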

ylxxwx commented 2 years ago

Let me try.

ylxxwx commented 2 years ago

It works now after I deleted the volume and restarted. Thanks very much.

jacobweinstock commented 2 years ago

> It works now after I deleted the volume and restarted. Thanks very much.

No problem. Thanks for the feedback! I updated the PR to note the need to delete the certs volume and restart.

chechuironman commented 2 years ago

hey,

What commands are you running to delete the certs? I took down the environment with 'docker-compose down -v', and when I rebuild it I still get the same error.

jacobweinstock commented 2 years ago

> hey,
>
> What commands are you running to delete the certs? I took down the environment with 'docker-compose down -v', and when I rebuild it I still get the same error.

Hmm, interesting. This is what I did:

docker-compose down -v --remove-orphans
git checkout main
git pull
docker-compose up -d

double-p commented 2 years ago

It might be a dangling ./deploy/compose/state/webroot/workflow/ca.pem.

ylxxwx commented 2 years ago

Stop or kill the containers first. Then use the following command to remove the volume:

$ docker volume prune
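
Since `docker volume prune` acts on every unused volume on the host, a narrower variant may be preferable on shared machines. The sketch below removes a single volume by name; the function name and the example volume name are hypothetical, so check `docker volume ls` for the real one. With `DRY_RUN=1` the commands are only printed.

```shell
#!/bin/sh
# Print each command; execute it only when DRY_RUN is not set to 1.
run() {
  echo "+ $*"
  [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# $1: name of the certs volume (hypothetical; look it up with `docker volume ls`).
remove_certs_volume() {
  volume="$1"
  # Stop the stack so the volume is no longer in use.
  run docker-compose down --remove-orphans
  # Remove only the named volume instead of pruning everything.
  run docker volume rm "$volume"
  # Bring the stack back up so the certificates are regenerated.
  run docker-compose up -d
}
```

For example, `DRY_RUN=1 sh -c '. ./reset.sh; remove_certs_volume compose_certs'` would print the three commands without touching Docker, assuming the sketch is saved as `reset.sh`.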