Open visheshtanksale opened 1 week ago
@littlejawa We're baffled as to why we cannot start any container with CRI-O and Kata. Every command that we've tried ends in "command not found".
PTAL.
Not sure if it's related, but it sounds similar to what we fixed in the CI with https://github.com/kata-containers/kata-containers/pull/9206
Can you double-check the crio config (in /etc/crio/crio.conf, and any files under /etc/crio/crio.conf.d/), and make sure that you have something like:
[crio]
storage_option = [
"overlay.skip_mount_home=true",
]
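A quick way to run the check suggested above is a small shell helper. This is only a sketch: the function name `has_skip_mount_home` is made up for illustration, and it assumes the option is written in the same form as in the snippet above.

```shell
# Hypothetical helper: succeed if any of the given CRI-O config files or
# directories contains an uncommented overlay.skip_mount_home=true entry.
has_skip_mount_home() {
    # -r: recurse into directories, -s: stay silent about missing paths,
    # -q: report only through the exit status.
    # [^#]* refuses to cross a '#', so commented-out lines do not match.
    grep -rsqE '^[^#]*overlay\.skip_mount_home=true' "$@"
}
```

It could be called as `has_skip_mount_home /etc/crio/crio.conf /etc/crio/crio.conf.d/ && echo "flag set" || echo "flag missing"`. It only inspects the files on disk, so it says nothing about whether the running crio has picked the setting up.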
@littlejawa I didn't have the storage_option param, but adding it doesn't help. I'm still hitting the same error.
Sounds weird because this is the exact same symptom and situation. Did you reload crio after adding it to the conf?
I have a tentative fix for this in kata-deploy - it needs to set that flag as part of crio config. Waiting on your feedback before pushing it, in case something else needs to be fixed.
@littlejawa I did reload the crio service. This is the config change:
# cat /etc/crio/crio.conf.d/99-kata-deploy | grep -A 3 -B 1 storage_option
[crio]
storage_option = [
"overlay.skip_mount_home=true",
]
I am still seeing the same error. Do you think there might be any other reason for this issue?
I don't remember seeing this kind of error, except with this config issue :-( Can you get an updated crio.log and kata-collect-data.sh output, to see if crio complains about the new setting somehow? Maybe it can tell us why it's not taking it into account.
I don't see anything obvious either :-(
Comparing with my working setup, there is one thing that is different: the default for the "storage_driver" is empty in your crio config, and it is "overlay" for me. This should not change anything, as the entry is commented out anyway. But I'm wondering if you're actually using the overlay driver? If not, the option we modified might have no effect.
Can you verify the content of your /etc/containers/storage.conf file, and check which driver is used by default? (should be at the very beginning of the file).
Alternatively, can you uncomment the line storage_driver = "" in your crio.conf file, and change it to storage_driver = "overlay"?
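For the storage.conf check mentioned above, something like the following could print the configured driver. It is a sketch under assumptions: the function name `storage_driver_of` is hypothetical, and it expects the driver to be set on an uncommented line of the form `driver = "..."`.

```shell
# Hypothetical helper: print the value of the first uncommented
# `driver = "..."` line found in the given storage.conf file.
storage_driver_of() {
    # The pattern is anchored at the start of the line (optional
    # whitespace only), so lines beginning with '#' are skipped.
    sed -nE 's/^[[:space:]]*driver[[:space:]]*=[[:space:]]*"([^"]*)".*/\1/p' "$1" | head -n 1
}
```

Calling `storage_driver_of /etc/containers/storage.conf` prints the driver name, or nothing if no such line exists.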
I did try that. It's coming up with the appropriate configs.
Current CRI-O configuration:\n[crio]\n root = \"/var/lib/containers/storage\"\n runroot = \"/run/containers/storage\"\n imagestore = \"\"\n storage_driver = \"overlay\"\n storage_option = [\"overlay.skip_mount_home=true\"]\n
It's not helping with the issue.
Also I am running ubuntu 22.04
# cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jamm
Are you on RHEL? Maybe that is causing the different behavior?
Most of my testing is done on Ubuntu 22.04.4, but without kubernetes - just crio + kata. My version of cri-o is probably a bit more recent because I'm building it from main, but that shouldn't count, because I've been testing with this kind of setup for a very long time, with all previous versions of crio.
From what I can tell, the only difference I have with your setup is that I didn't use kata-deploy. I can see that kata-deploy doesn't set that flag on crio, and I verified that without that flag I get the exact same problem that you have... I just tested it again this morning to be sure.
At this point, I really don't know :-(
@wainersm @fidencio, Sorry for the ping, but I'm lost here, and you have some more experience than me with kubernetes testing :-(
@visheshtanksale has a Kubernetes cluster using crio, set up with kata-deploy. They get an error on every pod creation saying the entrypoint for the pod is "not found". kata-deploy doesn't set the "skip_mount_home" flag for crio, so I had them change that setting... and it doesn't solve the problem :-(
Any idea what else could cause the same symptom?
Can you share what your kata runtime config looks like?
Here it is : configuration-qemu.toml
I don't think I modified it from the default, except the debug log level.
I realize that you were maybe asking for the cri-o config for kata? Here it is, just in case. Again, nothing different.
[crio.runtime.runtimes.kata]
runtime_path = "/opt/kata/bin/containerd-shim-kata-v2"
runtime_root = "/run/vc"
runtime_type = "vm"
privileged_without_host_devices = true
runtime_config_path = "/opt/kata/share/defaults/kata-containers/configuration-qemu.toml"
runtime_pull_image = false
[crio.runtime.runtimes.kata-remote]
runtime_path = "/opt/kata/bin/containerd-shim-kata-v2"
runtime_root = "/run/vc"
runtime_type = "vm"
privileged_without_host_devices = true
runtime_config_path = "/opt/kata/share/defaults/kata-containers/configuration-remote.toml"
runtime_pull_image = true
@littlejawa How do you install kata on your setup? I am trying to figure out what the difference is between your setup and the kata-deploy setup.
I'm retrieving the release archive from https://github.com/kata-containers/kata-containers/releases Specifically: I'm currently testing 3.4.0
I'm just unpacking it to /, so the folder structure is what the tarball contains.
Then I configure crio manually, by adding the entries that I posted above. I'm also adding the following in crio's conf:
# Set a flag in crio settings to avoid private bind mount
[crio]
storage_option = [
"overlay.skip_mount_home=true",
]
# Set debug logs in crio
[crio.runtime]
log_level = "debug"
I'm using separate conf files under /etc/crio/crio.conf.d/ for my config: one for the runtime definition, and one for the storage and debug flags.
And finally, I edit the .toml files for kata config, to enable debug logs.
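The storage and debug drop-in described above could be written with a helper along these lines. This is a sketch, not the exact procedure used here: the function name `write_crio_dropin` and the file name `99-storage-debug.conf` are hypothetical, and the file content is copied from the snippet posted above.

```shell
# Hypothetical helper: write the storage and debug settings as a drop-in
# file inside the given directory (for CRI-O: /etc/crio/crio.conf.d).
write_crio_dropin() {
    dir=$1
    mkdir -p "$dir" || return 1
    # Quoted 'EOF' keeps the content literal (no shell expansion).
    cat > "$dir/99-storage-debug.conf" <<'EOF'
# Set a flag in crio settings to avoid private bind mount
[crio]
storage_option = [
    "overlay.skip_mount_home=true",
]

# Set debug logs in crio
[crio.runtime]
log_level = "debug"
EOF
}
```

Run as root, `write_crio_dropin /etc/crio/crio.conf.d` followed by a restart of the crio service would apply the settings. Whether a restart rather than a reload is needed is an assumption worth verifying on your version.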
Description of problem
Setup Kata using kata deploy on CRI-O. When I create a pod using
The pod does not come up; it is stuck in CreateContainerError status.
Logs show this error
Attached complete logs here
If I try to bring up any other container using the kata-qemu runtime, I get a similar error: the command that is the entrypoint of the container is not found.
Expected result
The pod should come up without error
Actual result
The pod does not come up; it is stuck in CreateContainerError status.
Further information
Attached kata-collect-data.sh output here