Closed wokalski closed 1 year ago
Here are the logs from the boots container:
{"level":"info","ts":1650192223.6683865,"caller":"dhcp4-go@v0.0.0-20190402165401-39c137f31ad3/handler.go:105","msg":"","service":"github.com/tinkerbell/boots","pkg":"dhcp","pkg":"dhcp","event":"recv","mac":"0c:42:a1:97:f6:48","via":"0.0.0.0","iface":"enp2s0f1","xid":"\"3d:45:49:49\"","type":"DHCPDISCOVER","secs":4}
{"level":"info","ts":1650192223.6685946,"caller":"boots/dhcp.go:78","msg":"parsed option82/circuitid","service":"github.com/tinkerbell/boots","pkg":"main","mac":"0c:42:a1:97:f6:48","circuitID":""}
{"level":"info","ts":1650192223.6712575,"caller":"boots/dhcp.go:91","msg":"retrieved job is empty","service":"github.com/tinkerbell/boots","pkg":"main","type":"DHCPDISCOVER","mac":"0c:42:a1:97:f6:48","err":"discover from dhcp message: get hardware by mac from tink: rpc error: code = Unknown desc = SELECT: sql: no rows in result set","errVerbose":"rpc error: code = Unknown desc = SELECT: sql: no rows in result set\nget hardware by mac from tink\ngithub.com/tinkerbell/boots/packet.(*client).DiscoverHardwareFromDHCP\n\t/opt/actions-runner/_work/boots/boots/packet/endpoints.go:108\ngithub.com/tinkerbell/boots/job.discoverHardwareFromDHCP.func1\n\t/opt/actions-runner/_work/boots/boots/job/fetch.go:17\ngithub.com/golang/groupcache/singleflight.(*Group).Do\n\t/home/github/go/pkg/mod/github.com/golang/groupcache@v0.0.0-20190702054246-869f871628b6/singleflight/singleflight.go:56\ngithub.com/tinkerbell/boots/job.discoverHardwareFromDHCP\n\t/opt/actions-runner/_work/boots/boots/job/fetch.go:19\ngithub.com/tinkerbell/boots/job.CreateFromDHCP\n\t/opt/actions-runner/_work/boots/boots/job/job.go:106\nmain.dhcpHandler.serveDHCP\n\t/opt/actions-runner/_work/boots/boots/cmd/boots/dhcp.go:89\nmain.dhcpHandler.ServeDHCP.func1\n\t/opt/actions-runner/_work/boots/boots/cmd/boots/dhcp.go:50\ngithub.com/gammazero/workerpool.(*WorkerPool).dispatch.func1\n\t/home/github/go/pkg/mod/github.com/gammazero/workerpool@v0.0.0-20200311205957-7b00833861c6/workerpool.go:169\nruntime.goexit\n\t/opt/actions-runner/_work/_tool/go/1.16.3/x64/src/runtime/asm_amd64.s:1371\ndiscover from dhcp message"}
{"level":"info","ts":1650192227.7058403,"caller":"dhcp4-go@v0.0.0-20190402165401-39c137f31ad3/handler.go:105","msg":"","service":"github.com/tinkerbell/boots","pkg":"dhcp","pkg":"dhcp","event":"recv","mac":"0c:42:a1:97:f6:48","via":"0.0.0.0","iface":"enp2s0f1","xid":"\"3d:45:49:49\"","type":"DHCPDISCOVER","secs":8}
{"level":"info","ts":1650192227.7061045,"caller":"boots/dhcp.go:78","msg":"parsed option82/circuitid","service":"github.com/tinkerbell/boots","pkg":"main","mac":"0c:42:a1:97:f6:48","circuitID":""}
{"level":"info","ts":1650192227.7088065,"caller":"boots/dhcp.go:91","msg":"retrieved job is empty","service":"github.com/tinkerbell/boots","pkg":"main","type":"DHCPDISCOVER","mac":"0c:42:a1:97:f6:48","err":"discover from dhcp message: get hardware by mac from tink: rpc error: code = Unknown desc = SELECT: sql: no rows in result set","errVerbose":"rpc error: code = Unknown desc = SELECT: sql: no rows in result set\nget hardware by mac from tink\ngithub.com/tinkerbell/boots/packet.(*client).DiscoverHardwareFromDHCP\n\t/opt/actions-runner/_work/boots/boots/packet/endpoints.go:108\ngithub.com/tinkerbell/boots/job.discoverHardwareFromDHCP.func1\n\t/opt/actions-runner/_work/boots/boots/job/fetch.go:17\ngithub.com/golang/groupcache/singleflight.(*Group).Do\n\t/home/github/go/pkg/mod/github.com/golang/groupcache@v0.0.0-20190702054246-869f871628b6/singleflight/singleflight.go:56\ngithub.com/tinkerbell/boots/job.discoverHardwareFromDHCP\n\t/opt/actions-runner/_work/boots/boots/job/fetch.go:19\ngithub.com/tinkerbell/boots/job.CreateFromDHCP\n\t/opt/actions-runner/_work/boots/boots/job/job.go:106\nmain.dhcpHandler.serveDHCP\n\t/opt/actions-runner/_work/boots/boots/cmd/boots/dhcp.go:89\nmain.dhcpHandler.ServeDHCP.func1\n\t/opt/actions-runner/_work/boots/boots/cmd/boots/dhcp.go:50\ngithub.com/gammazero/workerpool.startWorker\n\t/home/github/go/pkg/mod/github.com/gammazero/workerpool@v0.0.0-20200311205957-7b00833861c6/workerpool.go:218\nruntime.goexit\n\t/opt/actions-runner/_work/_tool/go/1.16.3/x64/src/runtime/asm_amd64.s:1371\ndiscover from dhcp message"}
{"level":"info","ts":1650192235.779676,"caller":"dhcp4-go@v0.0.0-20190402165401-39c137f31ad3/handler.go:105","msg":"","service":"github.com/tinkerbell/boots","pkg":"dhcp","pkg":"dhcp","event":"recv","mac":"0c:42:a1:97:f6:48","via":"0.0.0.0","iface":"enp2s0f1","xid":"\"3d:45:49:49\"","type":"DHCPDISCOVER","secs":12}
{"level":"info","ts":1650192235.7798727,"caller":"boots/dhcp.go:78","msg":"parsed option82/circuitid","service":"github.com/tinkerbell/boots","pkg":"main","mac":"0c:42:a1:97:f6:48","circuitID":""}
{"level":"info","ts":1650192235.7824285,"caller":"boots/dhcp.go:91","msg":"retrieved job is empty","service":"github.com/tinkerbell/boots","pkg":"main","type":"DHCPDISCOVER","mac":"0c:42:a1:97:f6:48","err":"discover from dhcp message: get hardware by mac from tink: rpc error: code = Unknown desc = SELECT: sql: no rows in result set","errVerbose":"rpc error: code = Unknown desc = SELECT: sql: no rows in result set\nget hardware by mac from tink\ngithub.com/tinkerbell/boots/packet.(*client).DiscoverHardwareFromDHCP\n\t/opt/actions-runner/_work/boots/boots/packet/endpoints.go:108\ngithub.com/tinkerbell/boots/job.discoverHardwareFromDHCP.func1\n\t/opt/actions-runner/_work/boots/boots/job/fetch.go:17\ngithub.com/golang/groupcache/singleflight.(*Group).Do\n\t/home/github/go/pkg/mod/github.com/golang/groupcache@v0.0.0-20190702054246-869f871628b6/singleflight/singleflight.go:56\ngithub.com/tinkerbell/boots/job.discoverHardwareFromDHCP\n\t/opt/actions-runner/_work/boots/boots/job/fetch.go:19\ngithub.com/tinkerbell/boots/job.CreateFromDHCP\n\t/opt/actions-runner/_work/boots/boots/job/job.go:106\nmain.dhcpHandler.serveDHCP\n\t/opt/actions-runner/_work/boots/boots/cmd/boots/dhcp.go:89\nmain.dhcpHandler.ServeDHCP.func1\n\t/opt/actions-runner/_work/boots/boots/cmd/boots/dhcp.go:50\ngithub.com/gammazero/workerpool.(*WorkerPool).dispatch.func1\n\t/home/github/go/pkg/mod/github.com/gammazero/workerpool@v0.0.0-20200311205957-7b00833861c6/workerpool.go:169\nruntime.goexit\n\t/opt/actions-runner/_work/_tool/go/1.16.3/x64/src/runtime/asm_amd64.s:1371\ndiscover from dhcp message"}
{"level":"info","ts":1650192251.8749926,"caller":"dhcp4-go@v0.0.0-20190402165401-39c137f31ad3/handler.go:105","msg":"","service":"github.com/tinkerbell/boots","pkg":"dhcp","pkg":"dhcp","event":"recv","mac":"0c:42:a1:97:f6:48","via":"0.0.0.0","iface":"enp2s0f1","xid":"\"3d:45:49:49\"","type":"DHCPDISCOVER","secs":16}
{"level":"info","ts":1650192251.8751898,"caller":"boots/dhcp.go:78","msg":"parsed option82/circuitid","service":"github.com/tinkerbell/boots","pkg":"main","mac":"0c:42:a1:97:f6:48","circuitID":""}
{"level":"info","ts":1650192251.8778894,"caller":"boots/dhcp.go:91","msg":"retrieved job is empty","service":"github.com/tinkerbell/boots","pkg":"main","type":"DHCPDISCOVER","mac":"0c:42:a1:97:f6:48","err":"discover from dhcp message: get hardware by mac from tink: rpc error: code = Unknown desc = SELECT: sql: no rows in result set","errVerbose":"rpc error: code = Unknown desc = SELECT: sql: no rows in result set\nget hardware by mac from tink\ngithub.com/tinkerbell/boots/packet.(*client).DiscoverHardwareFromDHCP\n\t/opt/actions-runner/_work/boots/boots/packet/endpoints.go:108\ngithub.com/tinkerbell/boots/job.discoverHardwareFromDHCP.func1\n\t/opt/actions-runner/_work/boots/boots/job/fetch.go:17\ngithub.com/golang/groupcache/singleflight.(*Group).Do\n\t/home/github/go/pkg/mod/github.com/golang/groupcache@v0.0.0-20190702054246-869f871628b6/singleflight/singleflight.go:56\ngithub.com/tinkerbell/boots/job.discoverHardwareFromDHCP\n\t/opt/actions-runner/_work/boots/boots/job/fetch.go:19\ngithub.com/tinkerbell/boots/job.CreateFromDHCP\n\t/opt/actions-runner/_work/boots/boots/job/job.go:106\nmain.dhcpHandler.serveDHCP\n\t/opt/actions-runner/_work/boots/boots/cmd/boots/dhcp.go:89\nmain.dhcpHandler.ServeDHCP.func1\n\t/opt/actions-runner/_work/boots/boots/cmd/boots/dhcp.go:50\ngithub.com/gammazero/workerpool.(*WorkerPool).dispatch.func1\n\t/home/github/go/pkg/mod/github.com/gammazero/workerpool@v0.0.0-20200311205957-7b00833861c6/workerpool.go:169\nruntime.goexit\n\t/opt/actions-runner/_work/_tool/go/1.16.3/x64/src/runtime/asm_amd64.s:1371\ndiscover from dhcp message"}
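For anyone triaging similar output: the same lookup fails on every DHCPDISCOVER retry, so it can help to collapse the log into one line per MAC. Below is a minimal sketch, assuming the boots logs are one JSON object per line with the `mac` and `err` keys seen above. The function name `failed_lookups` is made up for illustration and is not part of boots or tink.

```python
import json
from collections import Counter


def failed_lookups(lines):
    """Count hardware lookups that failed with 'no rows in result set', per MAC."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict):
            continue
        # Assumption: a failed lookup carries the SQL error text in "err",
        # as in the "retrieved job is empty" entries above.
        if "no rows in result set" in entry.get("err", ""):
            counts[entry.get("mac", "unknown")] += 1
    return counts
```

Run over the log above, this should report a single MAC (0c:42:a1:97:f6:48) with one failure per retry, which points to a missing hardware record for that MAC rather than a DHCP transport problem.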
Ok, I'm super confused. I understand where it's coming from:
The hardware spec inserted into the database is hardcoded rather than dynamically generated based on the worker. I believe it's going to work after I insert a proper hardware definition.
It is, however, super unclear from the docs that I should do that, and honestly, it could probably be terraformed, too? It's too glaring an oversight to be real; I must've overlooked something in the docs, but I'm not sure what.
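To make "a proper hardware definition" concrete, here is a rough sketch of generating one from the worker's MAC instead of hardcoding it. The JSON layout, the placeholder ID and the IP values are assumptions modeled loosely on sandbox-style hardware records, not the authoritative schema. `HARDWARE_TEMPLATE` and `hardware_for_worker` are hypothetical names.

```python
import copy
import json

# Hypothetical template: the field names and values are assumptions for
# illustration, not the exact schema tink expects.
HARDWARE_TEMPLATE = {
    "id": "00000000-0000-0000-0000-000000000000",  # placeholder ID
    "metadata": {"facility": {"facility_code": "onprem"}, "instance": {}},
    "network": {
        "interfaces": [
            {
                "dhcp": {
                    "arch": "x86_64",
                    "ip": {
                        "address": "192.168.1.5",   # placeholder
                        "gateway": "192.168.1.1",   # placeholder
                        "netmask": "255.255.255.248",  # placeholder
                    },
                    "mac": "00:00:00:00:00:00",  # replaced per worker
                    "uefi": False,
                },
                "netboot": {"allow_pxe": True, "allow_workflow": True},
            }
        ]
    },
}


def hardware_for_worker(mac, template=HARDWARE_TEMPLATE):
    """Return a copy of the template with the worker's MAC filled in."""
    hw = copy.deepcopy(template)  # never mutate the shared template
    # Lowercased to match the form boots prints in its logs.
    hw["network"]["interfaces"][0]["dhcp"]["mac"] = mac.strip().lower()
    return hw


def hardware_json(mac):
    """Serialize the per-worker definition for whatever loads it into tink."""
    return json.dumps(hardware_for_worker(mac), indent=2)
```

The point is only that the MAC is an input rather than a constant; how the resulting JSON gets pushed into tink depends on the sandbox version in use.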
This does sound very not right :D. Can you retry using the code from #126?
I haven't used it, adding a correct hardware definition did work. I can't see how your PR fixes it though; it doesn't change anything about the hardware definitions, they are not converted into templates (as they should be).
My gut feeling is that somehow someone got it working for them consistently because the MAC seems to be not-so-random. When I created and destroyed the worker multiple times IIRC it got the same MAC.
I've used the tf setup a bunch on w/e machines EM ends up provisioning so there's no way a MAC stays the same. It gets updated here https://github.com/tinkerbell/sandbox/blob/main/deploy/compose/create-tink-records/create.sh#L20-L28. This happens (in my branch) by way of:
when docker-compose up is run, it will pick up the worker's MAC address (https://github.com/mmlb/tinkerbell-sandbox/blob/terraform-love/deploy/compose/docker-compose.yml#L191-L197) and update the hardware description before feeding it into tink: https://github.com/mmlb/tinkerbell-sandbox/blob/terraform-love/deploy/compose/create-tink-records/create.sh#L20-L28
Indeed, my bad! Ok, it definitely didn't work on master for some reason. I hope your branch has a fix for it.
@wokalski did #126 fix things for you?
I didn't test it. I made it work with my local tweaks. I hope it does, though!
Hello @wokalski, I'm trying to do exactly the same thing: running the terraform sandbox from my macOS machine to spin up the Provisioner and Worker on Equinix Metal, but the workflow gets stuck in the PENDING state. Could you please share your tweaks? TIA.
@CAcquaviva I did make it work but I didn't end up productizing this setup. Tinkerbell is undergoing a huge transition when it comes to internals. The issue you're hitting is most likely:
If you are thinking about creating a production setup using Tinkerbell and you have a small network, I'd encourage you to take a look at matchbox from Poseidon. I really like the architecture of Tinkerbell, but in my opinion it's just too much of a work in progress right now.
The project has moved on quite a bit since the issue was raised; namely, we no longer use the Postgres backend and the tink CLI has been deprecated.
This may still be an issue, but it's unclear what the next steps are. We'll take an action to validate the Terraform setup separately and raise issues as needed.
Two disclaimers:
After I reboot the tink_worker for the first time it doesn't get provisioned (after running terraform apply).
My first intuition was a networking issue, especially since I can see a couple of "this doesn't work as it's supposed to" in the terraform file. I'll run a tcpdump on the server on port 67 to check. That said, the network does seem to be set up correctly when I check it on the Equinix Metal portal.
It's not a networking issue.
If that's correct, I'm going to tinker in the worker itself; maybe I'm hitting #130? It's a bit odd though: I reran it a couple of times and it consistently didn't work.
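For the port 67 check, what matters in a capture is whether DHCPDISCOVER packets from the worker's MAC reach the provisioner at all. The sketch below decodes a raw DHCP (BOOTP) UDP payload by hand, following the fixed header layout from RFC 2131 and the message-type option (53) from RFC 2132. `summarize_dhcp` is an illustrative name, and obtaining the payload bytes (from a capture file, a socket, etc.) is left out.

```python
import struct

DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"
MESSAGE_TYPES = {
    1: "DHCPDISCOVER",
    2: "DHCPOFFER",
    3: "DHCPREQUEST",
    5: "DHCPACK",
    6: "DHCPNAK",
}


def summarize_dhcp(payload):
    """Extract client MAC, secs and message type from a DHCP UDP payload."""
    if len(payload) < 240 or bytes(payload[236:240]) != DHCP_MAGIC_COOKIE:
        raise ValueError("not a DHCP payload")
    hlen = min(payload[2], 16)  # hardware address length, capped at chaddr size
    secs = struct.unpack("!H", bytes(payload[8:10]))[0]
    mac = ":".join(f"{b:02x}" for b in payload[28:28 + hlen])
    msg_type = None
    i = 240  # options start right after the magic cookie
    while i < len(payload):
        code = payload[i]
        if code == 255:  # end option
            break
        if code == 0:  # pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break  # truncated option
        length = payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if code == 53 and len(value) == 1:  # DHCP message type
            msg_type = MESSAGE_TYPES.get(value[0], str(value[0]))
        i += 2 + length
    return {"mac": mac, "secs": secs, "type": msg_type}
```

If the decoded packets show DHCPDISCOVER from the worker's MAC with a growing secs value and no offer in return, the request is arriving and the problem is on the provisioner side, which matches what the boots logs show.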
Expected Behaviour
Tink-worker connects to the provisioner and one can see the worker under tink workflow events
Current Behaviour
The workflow is stuck in the PENDING state.
Steps to Reproduce (for bugs)
Run the instructions from here
Context
I was just trying to take Tinkerbell for a spin!
Your Environment
I'm running it on macOS, using the Terraform sandbox with Equinix Metal.