I can reproduce this issue, and it doesn't seem to be anything Talos-specific at the moment; it's a regular mount on the host.
Also interesting that it only happens the first time; if you run the command again, it succeeds immediately. It might be something ext4-related. Needs more digging.
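For what it's worth, one way to check whether the fast second run is just the page cache (a sketch; writing drop_caches needs root on the node or a privileged pod, while the dd runs inside the pod as in the reproducer below):
# flush the page cache so the read has to hit the disk again
sync
echo 3 > /proc/sys/vm/drop_caches
# re-read from the pod; if this is slow again, the fast second run was cached data
time dd if=/data/zerofile of=/dev/null bs=1K count=1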
At the time when it "hangs", the dd process is blocked in the Linux kernel.
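A sketch of how to see that for yourself, assuming the reading dd is still hung (run from a second shell in the pod; reading /proc/<pid>/stack needs root):
# State: D means uninterruptible sleep, i.e. blocked in the kernel
grep State /proc/$(pidof dd)/status
# dump the in-kernel stack of the blocked task
cat /proc/$(pidof dd)/stack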
Re-doing your reproducer with fsType: xfs in the PV "fixes" the problem:
# time dd if=/data/zerofile of=/dev/null bs=1K count=1
1+0 records in
1+0 records out
real 0m 0.00s
user 0m 0.00s
sys 0m 0.00s
So I think it's something ext4-specific coupled with the slow I/O performance of the QEMU volume (talosctl cluster create was never optimized for performance; with .qcow volumes it would probably be way better).
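If anyone needs a workaround in the meantime, a minimal sketch, assuming the extra disk shows up as /dev/vdb in the worker VM (the device name and the use of a local PV are my assumptions; the attached pvc.yaml.txt is the authoritative manifest):
# format the extra disk as xfs instead of ext4 (destroys any existing data)
mkfs.xfs -f /dev/vdb
# alternatively, if the PV declares a filesystem type, switch it to xfs
# (spec.local.fsType: xfs for a local PV) so kubelet mounts the volume as xfs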
Okay, I know what this is now, and thanks for reporting this bug!
Bug Report
Description
When I attempt to read large files (multiple GB) in volumes mounted into pods on a Talos cluster, the time to read even just one byte out of the file is insanely long, and seems to scale with the size of the file. I've watched transfer metrics, and it looks like the entire file is getting read in at open time (!?!). None of the readahead settings I know of look out of whack (the read_ahead_kb setting for the device is 128KB; the filesystem readaheads look okay, but they probably aren't relevant in light of the info in the next paragraph).
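For completeness, the checks I mean are along these lines (a sketch; vdb stands in for whatever device backs the volume):
# per-device readahead in KB (128 here)
cat /sys/block/vdb/queue/read_ahead_kb
# the same setting as the block layer reports it, in 512-byte sectors
blockdev --getra /dev/vdb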
So far, it does not matter what filesystem is mounted or which CSI driver the mount is using; I have replicated it using RBD, CephFS, geesefs S3, OpenEBS lvm-localpv, and plain old built-in local volumes. hostPath mounts and local volume mounts on the system disk are the only mounts where I haven't been able to replicate this behavior.
I have not been able to replicate this issue when I installed the same version of k3s or k8s via kubeadm in the same environment, which is why I'm thinking this is a Talos issue.
Logs
Haven't found anything in the logs that seems relevant, although when I add an strace to the second dd in the reproducer below, the long hang occurs during the openat() call opening the file from the mounted volume. Let me know if there are any logs you would like me to provide, but hopefully the reproducer is simple enough that you can replicate it in your environment.
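For anyone who wants to reproduce that observation, the strace invocation was along these lines (a sketch; strace isn't in the alpine image, so install it first):
# inside the pod
apk add strace
# -T prints the time spent in each syscall; the hang shows up on openat()
strace -T -e trace=openat dd if=/data/zerofile of=/dev/null bs=1K count=1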
Environment
Reproducer
sudo --preserve-env=HOME talosctl cluster create --provisioner qemu --extra-disks 1 --extra-disks-size 20480
Apply the attached pvc.yaml.txt and alpine.yaml.txt.
kubectl exec -it pod/alpine -- /bin/sh
Run dd if=/dev/zero of=/data/zerofile bs=4M count=4096 to create a 16GB file.
Run time dd if=/data/zerofile of=/dev/null bs=1K count=1 to read the first KB of that file.
The time taken by the second dd is very long.
Sample reproducer output (on my dev laptop with QEMU VMs running on NVMe)