lf-edge / eden

Eden is where EVE and Adam get tried and tested:
https://projecteve.dev
Apache License 2.0
49 stars 47 forks source link

eden/qemu: start QMP logger to log QEMU events #1004

Closed rouming closed 1 month ago

rouming commented 1 month ago

Nice to have QMP events from QEMU, probably can help to debug sudden reboots/reset issues.

This commit uses simple shell "echo | socat" cmd for reading qemu qmp events. Probably makes sense to rewrite on native go.

QEMU creates a socket file and uses is as server socket. socat attaches to it, sends handshake message and redirects output (qmp events) to the qmp.log file. if socket file was not created within 5 seconds, the startQMPLogger() complains, but this error is not considered as fatal and eden continues.

rouming commented 1 month ago

So folks (@uncleDecart @milan-zededa), I need some help. First of all very first commit smoothly works locally, but does not work on GH. The problem which I chase for a couple of days in a row is different content of currentPath, for example:

    qmpSock := filepath.Join(currentPath, defaults.DefaultDist, "qmp.sock")

(quite common construction used everywhere in eden to get some path). This qmpSock has the following path when I run eden locally: /home/roman/devel/zededa/eden/dist, and dist folder exists (contains all the files generated by eden) to the point when my code is executed, so I'm able to create qmp.log and qmp.sock files.

When I run same patch on GH, I see that qmpSock is equal to the following path: /tmp/go-test-script854418973/script-eden_start/dist, and dist in this particular case does not exist, so I'm not able to create any files in this folder. In actual action GH output I see these paths quite often: /home/runner/actions-runner/_work/eden/eden/eden/dist/default-eve.pid, so I was expecting, that currentPath would be equal to that, but this is not the case.

So I'm super confused and is not clear to me how to build the correct path for some files. Any help is appreciated.

rouming commented 1 month ago

Ok, the problem is solved by taking directory of the evePidFile, which always exists in the correct directly. I failed to create my own configuration due to the fact that it is always empty, I failed to use the currentPath due to the fact, that it is different from actual runner dist directory /home/runner/actions-runner/_work/eden/eden/eden/dist (I assume this is because we do os.Chdir() somewhere and current work directory becomes different only for GH runners). Anyway, for this minor PR this approach should work and does not look super ugly. Just a little bit. Little bit disgusting and crooked. Little. Bit.

rouming commented 1 month ago

For those who care: this line is likely broken and has similar problems I had: https://github.com/lf-edge/eden/blob/04badba3bdf1053ae64a3773cf681c46528ab80e/pkg/openevec/eve.go#L148 Same construction of using the currentPath and attempt to create a file further in the utils.CreateUsbNetConfImg() call.

milan-zededa commented 1 month ago

Ok, the problem is solved by taking directory of the evePidFile, which always exists in the correct directly. I failed to create my own configuration due to the fact that it is always empty, I failed to use the currentPath due to the fact, that it is different from actual runner dist directory /home/runner/actions-runner/_work/eden/eden/eden/dist (I assume this is because we do os.Chdir() somewhere and current work directory becomes different only for GH runners). Anyway, for this minor PR this approach should work and does not look super ugly. Just a little bit. Little bit disgusting and crooked. Little. Bit.

This is fine for now. I will also investigate why it is broken. We haven't yet tried onboarding using (fake) USB stick on github runners, which would try to use usbImagePath. So if it has the same problem it has not been noticed yet.

uncleDecart commented 1 month ago

I'll rerun failed tests and will merge afterwards

uncleDecart commented 1 month ago

Failure isn't something to do with your PR, merging it