edgelesssys / constellation

Constellation is the first Confidential Kubernetes. Constellation shields entire Kubernetes clusters from the (cloud) infrastructure using confidential computing.
GNU Affero General Public License v3.0

debugd: use runc as podman runtime #3205

Closed. burgerdev closed this 1 week ago.

burgerdev commented 1 week ago

Context

The log collection deployed by debugd has not been working since ~2024-06-08. Investigating a debug cluster showed that podman commands fail because the crun runtime is absent. A change in defaults might have been introduced by 0a3f77e92634fd9b3434172000e13fc2f23129d7.


netlify[bot] commented 1 week ago

Deploy Preview for constellation-docs ready!

| Name | Link |
| --- | --- |
| Latest commit | 03b3a95a8101791efe686ef988e45c281bfe2d17 |
| Latest deploy log | https://app.netlify.com/sites/constellation-docs/deploys/667d85817ebf54000859351d |
| Deploy Preview | https://deploy-preview-3205--constellation-docs.netlify.app |

burgerdev commented 1 week ago

I ran a manual e2e test, but there are still no logs. The container is running though, but it does not have logging enabled and thus I have no idea yet what the problem might be. Running a new image build to find out.

msanft commented 1 week ago

> I ran a manual e2e test, but there are still no logs. The container is running though, but it does not have logging enabled and thus I have no idea yet what the problem might be. Running a new image build to find out.

No logs, or untagged logs? Untagged logs would be explained by the missing attribute when calling the e2e-test action in the manual e2e test workflow.

burgerdev commented 1 week ago

As far as I can tell, no logs. I would have expected some for this run, for example: https://github.com/edgelesssys/constellation/actions/runs/9694776948.

burgerdev commented 1 week ago

filebeat does not look healthy:

{"log.level":"warn","@timestamp":"2024-06-27T13:58:47.873Z","log.logger":"input","log.origin":{"file.name":"v2/loader.go","file.line":102},"message":"EXPERIMENTAL: The journald input is experimental","service.name":"filebeat","input":"journald","stability":"Experimental","deprecated":false,"ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2024-06-27T13:58:47.874Z","log.logger":"input.journald","log.origin":{"file.name":"compat/compat.go","file.line":124},"message":"Input 'journald' failed with: input.go:130: input journald failed (id=journald)\n\tinput.go:174: failed to create reader for LOCAL_SYSTEM_JOURNAL journal (path=LOCAL_SYSTEM_JOURNAL): reader.go:99: failed to open local journal: unable to open a handle to the library","service.name":"filebeat","id":"journald","ecs.version":"1.6.0"}

That's an error from dlopen-ing a shared library. I also don't see any systemd shared libs in the filebeat container. See https://github.com/coreos/go-systemd/blob/7d375ecc2b092916968b5601f74cca28a8de45dd/sdjournal/functions.go#L30-L38
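A quick way to confirm the missing library is to look for the file names the journal reader tries to load. The sketch below is only an approximation: it checks whether files exist and does not call dlopen. The candidate names are an assumption based on the linked `functions.go` and should be checked against it. The directory list and the `find_libsystemd` helper are hypothetical.

```shell
#!/bin/sh
# find_libsystemd [ROOT]: print each candidate libsystemd file found under ROOT.
# Returns 0 if at least one candidate exists, 1 otherwise.
# Assumption: the name list mirrors the linked go-systemd source; the
# directories are common library locations, not an exhaustive search path.
find_libsystemd() {
    root=${1:-}
    found=1
    for dir in /usr/lib64 /usr/lib /lib64 /lib; do
        for name in libsystemd.so.0 libsystemd.so \
                    libsystemd-journal.so.0 libsystemd-journal.so; do
            if [ -e "$root$dir/$name" ]; then
                printf '%s\n' "$root$dir/$name"
                found=0
            fi
        done
    done
    return "$found"
}
```

Run inside the container, `find_libsystemd || echo "no libsystemd found"` would print the message when none of the candidates exist. On systems where `/lib64` is a symlink to `/usr/lib64`, the same library may be listed twice.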

msanft commented 1 week ago

> failed to open local journal: unable to open a handle to the library

This should be an addressable problem. I can see if I can fix it tomorrow.

However, it's still weird to me that this worked previously. Maybe filebeat did not talk directly to the journal before an update?

burgerdev commented 1 week ago

Update from Fedora 38 to 40: https://github.com/edgelesssys/constellation/pull/3106/files.

$ docker run -it --rm --entrypoint /bin/bash ghcr.io/edgelesssys/constellation/filebeat-debugd:v2.17.0-pre.0.20240513104207-d76c9ac82de7@sha256:6567d682385c06b49f6d56fdf3f20d5c24809bbfced15b816f4717bf837fc776 -c "ls -l /usr/lib64/libsystemd*"
lrwxrwxrwx 1 root root     20 Mar 11 00:00 /usr/lib64/libsystemd.so.0 -> libsystemd.so.0.36.0
-rwxr-xr-x 1 root root 961864 Mar 11 00:00 /usr/lib64/libsystemd.so.0.36.0
$ docker run -it --rm --entrypoint /bin/bash ghcr.io/edgelesssys/constellation/filebeat-debugd:v2.17.0-pre.0.20240524110423-80917921e3d6@sha256:a58db8fef0740e0263d1c407f43f2fa05fdeed200b32ab58d32fb11873477231 -c "ls -l /usr/lib64/libsystemd*"
ls: cannot access '/usr/lib64/libsystemd*': No such file or directory
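The same comparison can be scripted across several image references. This is a minimal sketch under a few assumptions: `has_libsystemd`, `report_libsystemd`, and the `CONTAINER_CLI` variable are hypothetical names, and the images are assumed to ship `/bin/bash`, as the two images above do.

```shell
#!/bin/sh
# Container CLI to invoke. The commands above use docker; podman accepts the
# same run flags used here, but that is untested in this sketch.
CONTAINER_CLI=${CONTAINER_CLI:-docker}

# has_libsystemd IMAGE: succeed if IMAGE contains /usr/lib64/libsystemd.so*
has_libsystemd() {
    "$CONTAINER_CLI" run --rm --entrypoint /bin/bash "$1" \
        -c 'ls /usr/lib64/libsystemd.so* >/dev/null 2>&1'
}

# report_libsystemd IMAGE...: print one "present" or "missing" line per image
report_libsystemd() {
    for image in "$@"; do
        if has_libsystemd "$image"; then
            printf 'present %s\n' "$image"
        else
            printf 'missing %s\n' "$image"
        fi
    done
}
```

Calling `report_libsystemd` with the two image references above should report the first as present and the second as missing. Note that an image that cannot be pulled or started is also reported as missing, so a `missing` line is a hint rather than proof.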

burgerdev commented 1 week ago

03b3a95 seems to have done the trick.

https://search-e2e-logs-y46renozy42lcojbvrt3qq7csm.eu-central-1.es.amazonaws.com/_dashboards/app/data-explorer/discover?security_tenant=global#?_a=(discover:(columns:!(_source),isDirty:!f,sort:!()),metadata:(indexPattern:'9004ee20-77cc-11ee-b137-27c60b9ad4a4',view:discover))&_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-15m,to:now))&_q=(filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'9004ee20-77cc-11ee-b137-27c60b9ad4a4',key:metadata.github.run-id,negate:!f,params:(query:'9697184033'),type:phrase),query:(match_phrase:(metadata.github.run-id:'9697184033')))),query:(language:kuery,query:''))

github-actions[bot] commented 1 week ago

Coverage report

| Package | Old | New | Trend |
| --- | --- | --- | --- |
| debugd/filebeat | [no test files] | [no test files] | :construction: |
| debugd/internal/debugd/logcollector | 6.10% | 6.10% | :construction: |