k3s-io / kine

Run Kubernetes on MySQL, Postgres, sqlite, dqlite, not etcd.

error while range on ...: disk I/O error: no such file or directory #135

Closed ncopa closed 2 years ago

ncopa commented 2 years ago

We (k0s team) are experiencing a really weird error in kine which appears to be a kine (or maybe even sqlite) bug:

Sep 01 13:23:18 lima-lens-vm-lima k0s[1074]: time="2022-09-01 13:23:18" level=info msg="F0901 13:23:18.129957    2228 controller.go:166] Unable to perform initial IP allocation check: unable to refresh the service IP block: rpc error: code = Unknown desc = disk I/O error: no such file or directory" component=kube-apiserver
Sep 01 13:23:18 lima-lens-vm-lima k0s[1074]: time="2022-09-01 13:23:18" level=info msg="time=\"2022-09-01T13:23:18Z\" level=error msg=\"error while range on /registry/prioritylevelconfigurations/ /registry/prioritylevelconfigurations/: disk I/O error: no such file or directory\"" component=kine
Sep 01 13:23:18 lima-lens-vm-lima k0s[1074]: time="2022-09-01 13:23:18" level=info msg="time=\"2022-09-01T13:23:18Z\" level=error msg=\"error while range on /registry/minions/ /registry/minions/: disk I/O error: no such file or directory\"" component=kine
Sep 01 13:23:18 lima-lens-vm-lima k0s[1074]: time="2022-09-01 13:23:18" level=info msg="time=\"2022-09-01T13:23:18Z\" level=error msg=\"error while range on /registry/pods/ /registry/pods/: disk I/O error: no such file or directory\"" component=kine
Sep 01 13:23:18 lima-lens-vm-lima k0s[1074]: time="2022-09-01 13:23:18" level=info msg="time=\"2022-09-01T13:23:18Z\" level=error msg=\"error while range on /registry/serviceaccounts/ /registry/serviceaccounts/: disk I/O error: no such file or directory\"" component=kine
Sep 01 13:23:18 lima-lens-vm-lima k0s[1074]: time="2022-09-01 13:23:18" level=info msg="time=\"2022-09-01T13:23:18Z\" level=error msg=\"error while range on /registry/runtimeclasses/ /registry/runtimeclasses/: disk I/O error: no such file or directory\"" component=kine
Sep 01 13:23:18 lima-lens-vm-lima k0s[1074]: time="2022-09-01 13:23:18" level=info msg="time=\"2022-09-01T13:23:18Z\" level=error msg=\"error while range on /registry/limitranges/ /registry/limitranges/: disk I/O error: no such file or directory\"" component=kine
Sep 01 13:23:18 lima-lens-vm-lima k0s[1074]: time="2022-09-01 13:23:18" level=info msg="time=\"2022-09-01T13:23:18Z\" level=error msg=\"error while range on /registry/persistentvolumes/ /registry/persistentvolumes/: disk I/O error: no such file or directory\"" component=kine
Sep 01 13:23:18 lima-lens-vm-lima k0s[1074]: time="2022-09-01 13:23:18" level=info msg="time=\"2022-09-01T13:23:18Z\" level=error msg=\"error while range on /registry/ingressclasses/ /registry/ingressclasses/: disk I/O error: no such file or directory\"" component=kine

When this happens, kine seems to be completely broken and the cluster does not start up. Restarting k0s (and kine with it) appears to resolve it.

kine is started as:

/var/lib/k0s/bin/kine --endpoint=sqlite:///var/lib/k0s/db/state.db?more=rwc&_journal=WAL&cache=shared --listen-address=unix:///run/k0s/kine/kine.sock:2379

It happens occasionally when we reboot the VM, maybe once every 1-5 reboots, but not on every reboot.

We have been able to reproduce it on both x86_64 and arm64 macOS hosts and on an x86_64 Linux host. We have tested kernels 5.10.127, 5.10.136 and a 5.18 kernel. The guest is a Debian 11 VM.

We believe we have ruled out filesystem corruption. fsck.ext4 -n /dev/vda2 reports the filesystem as clean, and pragma integrity_check in the sqlite shell reports the db file as ok.
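
For completeness, the sqlite-side check can also be scripted from Go against the same database file; a minimal sketch (assuming the path from the kine command line below):

package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	// Path assumed from the kine --endpoint flag below.
	db, err := sql.Open("sqlite3", "/var/lib/k0s/db/state.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// PRAGMA integrity_check reports "ok" in a single row when the file is intact.
	var result string
	if err := db.QueryRow("PRAGMA integrity_check;").Scan(&result); err != nil {
		log.Fatal(err)
	}
	fmt.Println("integrity_check:", result)
}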

The logs seem to indicate that kine is gracefully shut down, but I am not 100% sure it is. Even if it isn't, I think kine should be able to recover.

I have tried the officially built kine v0.9.3 (1c94e77) binary and a custom-built kine binary from git master with github.com/mattn/go-sqlite3 v1.14.15 (the latter seemed to make the error harder to reproduce, but it still happened after 9 to 11 reboots).

strace seems to indicate that it is a stat call that returns ENOENT when looking for a state.db-journal. I suspect that this is not a critical error, but that the error reporting is buggy and propagates the wrong error code. The ENOENT (no such file or directory) does not make sense to me.
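
For what it's worth, go-sqlite3 exposes the underlying OS errno on its error type, so the propagated ENOENT can be inspected directly instead of being inferred from the message text. A minimal sketch (the path and query are placeholders, not what kine actually runs):

package main

import (
	"database/sql"
	"errors"
	"fmt"
	"log"

	sqlite3 "github.com/mattn/go-sqlite3"
)

func main() {
	// Placeholder path; the real one comes from the kine --endpoint flag.
	db, err := sql.Open("sqlite3", "/var/lib/k0s/db/state.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Placeholder query; in kine this would be one of the range/list queries.
	_, err = db.Exec("SELECT 1")

	var serr sqlite3.Error
	if errors.As(err, &serr) {
		// SystemErrno comes from sqlite3_system_errno() and is what ends up
		// as "no such file or directory" in the kine log lines.
		fmt.Printf("code=%d extended=%d system errno=%v\n",
			serr.Code, serr.ExtendedCode, serr.SystemErrno)
	}
}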

It also seems to be timing related. Re-ordering the services appears to make it happen less frequently, and so does setting psow=0. We are still able to reproduce it, just less frequently.

Any ideas what this might be or what we can do to debug this further?

brandond commented 2 years ago

This is probably deep in the sqlite code and not something we can fix in Kine, but I have a few questions:

ncopa commented 2 years ago

This is probably deep in the sqlite code and not something we can fix in Kine, but I have a few questions:

  • When this happens, does kine start and run normally for a while, or does it raise errors immediately on startup?

It happens during startup.

  • Are there any steps you can perform to reproduce this on demand?

Unfortunately, I don't have a simple, reliable way to reproduce it. It only happens occasionally when rebooting the VM. So far we have not been able to reproduce it without rebooting the machine.

The way I have been able to reproduce it is to use lima with the following config:

cpus: 2
memory: "2GiB"
disk: "50GiB"
env:
  K0S_VERSION: v1.23.8+k0s.0
images:
- location: "https://downloads.k8slens.dev/lens-local-vm-images/lens-vm-1233b84.amd64.qcow2"
  arch: "x86_64"
  digest: "sha512:76cbce47c9e4b10972fc97853e26b81ba172a13614153ca8637e22ed20a3949ee8ff40d058b42f82c8a788ccb4bd54c0d22dffcfc8a138a79db301003b58b99e"
- location: "https://downloads.k8slens.dev/lens-local-vm-images/lens-vm-1233b84.arm64.qcow2"
  arch: "aarch64"
  digest: "sha512:799d2198664315759361e155e924783c291c100a475e0254a9b33ea804c832af5e1d362dcca5433147ba24ca48648a481810d4ed97d5251a4b9283cb61ce2988"

mounts:
- location: "~"
  writable: true
containerd:
  system: false
  user: false
provision:
- mode: system
  script: |
    #!/bin/bash
    # Increase default max_user_watches from 8192 to help with development environments with many file watches
    echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p
- mode: system
  script: |
    #!/bin/bash
    set -eux -o pipefail

    if [ -f /etc/systemd/system/k0scontroller.service ]
    then
      find /var/lib/k0s/ -maxdepth 2 -type f -size 0 | xargs rm || true
      exit 0
    fi

    if [ "$(/usr/local/bin/k0s version)" != "${K0S_VERSION}" ]
    then
      curl -sSfL --retry 5 https://get.k0s.sh | K0S_VERSION=${K0S_VERSION} sh
    fi

    k0s install controller --single \
      --config /etc/k0s.yaml \
      --cri-socket remote:unix:///var/run/cri-dockerd.sock \
      --kubelet-extra-args="--cgroup-driver=systemd --hostname-override=lens-desktop-kube"
    k0s start

    if ! timeout 30s bash -c "until k0s kc get nodes >/dev/null 2>&1; do sleep 3; done"; then
      echo >&2 "k0s control plane failed to start"
      exit 1
    fi

    k0s kc apply -f /etc/k0s/manifests/
probes:
- script: |
    #!/bin/bash
    set -eux -o pipefail

    if ! timeout 30s bash -c "until sudo docker ps >/dev/null 2>&1; do sleep 3; done"; then
      echo >&2 "docker is not yet running"
      exit 1
    fi
  description: "probe docker"
- script: |
    #!/bin/bash
    set -eux -o pipefail
    if ! timeout 30s bash -c "until sudo k0s kc get nodes >/dev/null 2>&1; do sleep 3; done"; then
      echo >&2 "k0s is not yet running"
      exit 1
    fi
  description: "probe k0s"
- script: |
    #!/bin/bash
    set -eux -o pipefail

    if ! timeout 30s bash -c "until pgrep dockerd >/dev/null 2>&1; do sleep 3; done"; then
      echo >&2 "docker is not yet running"
      exit 1
    fi

    sudo chmod 0666 /var/run/docker.sock

    gunzip -c /var/lib/k0s/binfmt.tar.gz | sudo docker image load
    if [ "$(arch)" = "aarch64" ]
    then
      sudo docker run --rm --privileged tonistiigi/binfmt:qemu-v6.2.0 --install amd64
    else
      sudo docker run --rm --privileged tonistiigi/binfmt:qemu-v6.2.0 --install arm64
    fi
  description: "probe docker process"
portForwards:
- guestPort: 10257
  ignore: true
- guestPort: 10259
  ignore: true
- guestPort: 10250
  ignore: true
- guestPort: 10249
  ignore: true
- guestPort: 8080
  ignore: true
- guestPort: 179
  ignore: true
- guestSocket: "/var/run/docker.sock"
  hostSocket: "{{.Dir}}/sock/docker.sock"
message: |
  To run `docker` on the host (assumes docker-cli is installed), run the following commands:
  ------
  docker context create lens-vm --docker "host=unix://{{.Dir}}/sock/docker.sock"
  docker context use lens-vm
  docker run hello-world
  ------

Then I use this script in a loop to reproduce it:

$ cat doesitwork.sh 
#!/bin/sh

while ! pidof kube-scheduler; do sleep 1; done

The loop:

$ for i in $(seq 0 100); do echo "# $i ####################"; limactl stop lens-vm-lima ; sleep 2s; limactl start lens-vm-lima && limactl shell lens-vm-lima -- timeout 15 sh ./doesitwork.sh || break; done

  • Can you share more information on why you're customizing the sqlite journal/caching configuration?

Maybe @jnummelin can help us with that? (k0sproject/k0s@8e6a38ae6dab47f2c66125ce65d2ecbd37b25e1c and k0sproject/k0s@d3a7c1509cbb73ad527069f2582f121cf18d2c5b)

  • Do you see this error if you don't customize these parameters?

Yes. I tried removing the parameters, and it happened after 5 reboots.

ncopa commented 2 years ago

Is there some way to find out the exact SQL query that is performed when "error while range on /registry/pods/ /registry/pods/" is generated?

brandond commented 2 years ago

If you start the kine binary with --debug, it will log all the queries at TRACE level. It'll be very noisy.

Can you strip your test down to just starting kine and running an etcdctl query against it, or does it only happen when using k0s? I do think it's odd that we've never seen this in k3s.
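
A stripped-down check could also be done with the etcd Go client instead of etcdctl; a rough sketch, assuming the unix socket address from the k0s unit above is usable as a client endpoint:

package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		// Socket address taken from the kine --listen-address flag above.
		Endpoints:   []string{"unix:///run/k0s/kine/kine.sock:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Roughly the same kind of range the apiserver issues for /registry/pods/.
	resp, err := cli.Get(ctx, "/registry/pods/", clientv3.WithPrefix(), clientv3.WithKeysOnly())
	if err != nil {
		log.Fatal(err)
	}
	for _, kv := range resp.Kvs {
		fmt.Println(string(kv.Key))
	}
}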

ncopa commented 2 years ago

I was not able to reproduce the issue with --debug and I have not yet been able to reproduce it with kine + etcdctl alone. Changing the dependencies of unrelated systemd services seems to "fix" it.

Based on the symptoms I'd guess it is a race condition of some sort.

ncopa commented 2 years ago

After having spent a couple of days on this, I think the problem may be in go-sqlite3.

I modified it to print a stack trace when constructing the error:

index 9c0f4d8..43d1c1f 100644
--- a/sqlite3.go
+++ b/sqlite3.go
@@ -827,11 +827,15 @@ func lastError(db *C.sqlite3) error {
                systemErrno = syscall.Errno(C.sqlite3_system_errno(db))
        }

+       b := make([]byte, 2048)
+       n := runtime.Stack(b, false)
+       stackStr := string(b[:n])
+
        return Error{
                Code:         ErrNo(rv),
                ExtendedCode: ErrNoExtended(extrv),
                SystemErrno:  systemErrno,
-               err:          errStr,
+               err:          fmt.Sprintf("%s: %s", errStr, stackStr),
        }
 }

This gave me errors like:

Sep 07 11:43:30 lima-lens-vm-lima k0s[1073]: time="2022-09-07 11:43:30" level=info msg="time=\"2022-09-07T11:43:30Z\" level=info msg=\"Kine available at unix:///run/k0s/kine/kine.sock:2379\"" component=kine
Sep 07 11:43:32 lima-lens-vm-lima k0s[1073]: time="2022-09-07 11:43:32" level=info msg="time=\"2022-09-07T11:43:32Z\" level=error msg=\"error in txn: disk I/O error: goroutine 4748 [running]:\\ngithub.com/mattn/go-sqlite3.lastError(0xc00162adb0?)\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:831 +0xc5\\ngithub.com/mattn/go-sqlite3.(*SQLiteConn).lastError(...)\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:810\\ngithub.com/mattn/go-sqlite3.(*SQLiteStmt).execSync(0xc00162adb0, {0xc0015f77c0?, 0xc001603bc0?, 0xc001607ad0?})\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:2093 +0x154\\ngithub.com/mattn/go-sqlite3.(*SQLiteStmt).exec.func1()\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:2060 +0x32\\ncreated by github.com/mattn/go-sqlite3.(*SQLiteStmt).exec\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:2059 +0x11b\\n: no such file or directory\"" component=kine
Sep 07 11:43:32 lima-lens-vm-lima k0s[1073]: time="2022-09-07 11:43:32" level=info msg="time=\"2022-09-07T11:43:32Z\" level=error msg=\"error in txn: disk I/O error: goroutine 4754 [running]:\\ngithub.com/mattn/go-sqlite3.lastError(0xc00162b380?)\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:831 +0xc5\\ngithub.com/mattn/go-sqlite3.(*SQLiteConn).lastError(...)\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:810\\ngithub.com/mattn/go-sqlite3.(*SQLiteStmt).execSync(0xc00162b380, {0xc0015f7e00?, 0xc001642060?, 0xc001607c30?})\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:2093 +0x154\\ngithub.com/mattn/go-sqlite3.(*SQLiteStmt).exec.func1()\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:2060 +0x32\\ncreated by github.com/mattn/go-sqlite3.(*SQLiteStmt).exec\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:2059 +0x11b\\n: no such file or directory\"" component=kine
...
Sep 07 11:45:19 lima-lens-vm-lima k0s[1073]: time="2022-09-07 11:45:19" level=info msg="time=\"2022-09-07T11:45:19Z\" level=error msg=\"error while range on /registry/configmaps/kube-system/ /registry/configmaps/kube-system/: disk I/O error: goroutine 37788 [running]:\\ngithub.com/mattn/go-sqlite3.lastError(0xc001628ae0?)\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:831 +0xc5\\ngithub.com/mattn/go-sqlite3.(*SQLiteDriver).Open.func5({0xc001628ae0?, 0x19?})\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:1482 +0x5c\\ngithub.com/mattn/go-sqlite3.(*SQLiteDriver).Open(0xc00018e7e0, {0x7fff7feb9ecb, 0x3b})\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:1698 +0x3857\\ndatabase/sql.dsnConnector.Connect(...)\\n\\t/usr/lib/go/src/database/sql/sql.go:759\\ndatabase/sql.(*DB).conn(0xc0002e88f0, {0x109bcf8, 0xc000f142a0}, 0x1)\\n\\t/usr/lib/go/src/database/sql/sql.go:1393 +0x763\\ndatabase/sql.(*DB).query(0xc001a2a540?, {0x109bcf8, 0xc000f142a0}, {0xc000619680, 0x23a}, {0xc0013f1960, 0x2, 0x2}, 0x1f?)\\n\\t/usr/lib/go/src/database/sql/sql.go:1730 +0x5d\\ndatabase/sql.(*DB).QueryContext(0xc000619680?, {0x109bcf8, 0xc000f142a0}, {0xc000619680, 0x23a}, {0xc0013f1960, 0x2, 0x2})\\n\\t/usr/lib/go/src/database/sql/sql.go:1708 +0xda\\ngithub.com/k3s-io/kine/pkg/drivers/generic.(*Generic).query(0xc0003390e0, {0x109bcf8, 0xc000f142a0}, {0xc000619680, 0x23a}, {0xc0013f1960, 0x2, 0x2})\\n\\t/var/ncopa/src/kine/pkg/drivers/generic/generic.go:260 +0x1f6\\ngithub.com/k3s-io/kine/pkg/drivers/generic.(*Generic).ListCurrent(0xc0003390e0, {0x109bcf8, 0xc000f142a0}, {0xc00003e7b0, 0x22}, 0x1f5?, 0x0)\\n\\t/var/ncopa/src/kine/pkg/drivers/generic/generic.go:342 +0x18f\\ngithub.com/k3s-io/kine/pkg/logstructured/sqllog.(*SQLLog).List(0xc000401540, {0x109bcf8, 0xc000f142a0}, {0xc00003e750, 0x21}, {0xc00003e780, 0x21}, 0x649c99?, 0x0, 0x0)\\n\\t/var/ncopa/src/kine/pkg/logstructured/sqllog/sql.go:275 +0x1b8\\ngithub.com/k3s-io/kine/pkg/logstructured.(*LogStructured).List(0xc00019db60, {0x109bcf8, 0xc000f142a0}, {0xc00003e750, 0x21}, {0xc00003e780, 0x21}, 0x1f5, 0x0)\\n\\t/var/ncopa/src/kine/pkg/logstructured/logstructured.go:167 +0x16d\\ngithub.com/k3s-io/kine/pkg/server.(*LimitedServer).list(0xc00018ed80, {0x109bcf8, 0xc000f142a0}, 0xc0011745a0)\\n\\t/var/ncopa/src/kine/pkg/server/list.go:39 +0x2a8\\ngithub.com/k3s-io/kine/pkg/server.(*LimitedServer).Range(0x49?, {0x109b: no such file or directory\"" component=kine
Sep 07 11:45:19 lima-lens-vm-lima k0s[1073]: time="2022-09-07 11:45:19" level=info msg="time=\"2022-09-07T11:45:19Z\" level=error msg=\"error while range on /registry/roles/ /registry/roles/: disk I/O error: goroutine 37841 [running]:\\ngithub.com/mattn/go-sqlite3.lastError(0x4074d1?)\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:831 +0xc5\\ngithub.com/mattn/go-sqlite3.(*SQLiteConn).lastError(...)\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:810\\ngithub.com/mattn/go-sqlite3.(*SQLiteRows).nextSyncLocked(0xc000793ce0, {0xc0007dafc0, 0x1, 0xc000116000?})\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:2200 +0x8b\\ngithub.com/mattn/go-sqlite3.(*SQLiteRows).Next.func1()\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:2174 +0x32\\ncreated by github.com/mattn/go-sqlite3.(*SQLiteRows).Next\\n\\t/home/ncopa/src/go-sqlite3/sqlite3.go:2173 +0x1a5\\n: no such file or directory\"" component=kine

The full trace: kine-trace4.txt

Re-formatting the output (replace \\n with newline and \\t with a tab) gives this:

Sep 07 11:43:32 lima-lens-vm-lima k0s[1073]: time="2022-09-07 11:43:32" level=info msg="time=\"2022-09-07T11:43:32Z\" level=error msg=\"error in txn: disk I/O error: goroutine 4748 [running]:
github.com/mattn/go-sqlite3.lastError(0xc00162adb0?)
        /home/ncopa/src/go-sqlite3/sqlite3.go:831 +0xc5
github.com/mattn/go-sqlite3.(*SQLiteConn).lastError(...)
        /home/ncopa/src/go-sqlite3/sqlite3.go:810
github.com/mattn/go-sqlite3.(*SQLiteStmt).execSync(0xc00162adb0, {0xc0015f77c0?, 0xc001603bc0?, 0xc001607ad0?})
        /home/ncopa/src/go-sqlite3/sqlite3.go:2093 +0x154
github.com/mattn/go-sqlite3.(*SQLiteStmt).exec.func1()
        /home/ncopa/src/go-sqlite3/sqlite3.go:2060 +0x32
created by github.com/mattn/go-sqlite3.(*SQLiteStmt).exec
        /home/ncopa/src/go-sqlite3/sqlite3.go:2059 +0x11b
: no such file or directory\"" component=kine
Sep 07 11:43:32 lima-lens-vm-lima k0s[1073]: time="2022-09-07 11:43:32" level=info msg="time=\"2022-09-07T11:43:32Z\" level=error msg=\"error in txn: disk I/O error: goroutine 4754 [running]:
github.com/mattn/go-sqlite3.lastError(0xc00162b380?)
        /home/ncopa/src/go-sqlite3/sqlite3.go:831 +0xc5
github.com/mattn/go-sqlite3.(*SQLiteConn).lastError(...)
        /home/ncopa/src/go-sqlite3/sqlite3.go:810
github.com/mattn/go-sqlite3.(*SQLiteStmt).execSync(0xc00162b380, {0xc0015f7e00?, 0xc001642060?, 0xc001607c30?})
        /home/ncopa/src/go-sqlite3/sqlite3.go:2093 +0x154
github.com/mattn/go-sqlite3.(*SQLiteStmt).exec.func1()
        /home/ncopa/src/go-sqlite3/sqlite3.go:2060 +0x32
created by github.com/mattn/go-sqlite3.(*SQLiteStmt).exec
        /home/ncopa/src/go-sqlite3/sqlite3.go:2059 +0x11b
: no such file or directory\"" component=kine

...

Sep 07 11:45:19 lima-lens-vm-lima k0s[1073]: time="2022-09-07 11:45:19" level=info msg="time=\"2022-09-07T11:45:19Z\" level=error msg=\"error while range on /registry/configmaps/kube-system/ /registry/configmaps/kube-system/: disk I/O error: goroutine 37788 [running]:
github.com/mattn/go-sqlite3.lastError(0xc001628ae0?)
        /home/ncopa/src/go-sqlite3/sqlite3.go:831 +0xc5
github.com/mattn/go-sqlite3.(*SQLiteDriver).Open.func5({0xc001628ae0?, 0x19?})
        /home/ncopa/src/go-sqlite3/sqlite3.go:1482 +0x5c
github.com/mattn/go-sqlite3.(*SQLiteDriver).Open(0xc00018e7e0, {0x7fff7feb9ecb, 0x3b})
        /home/ncopa/src/go-sqlite3/sqlite3.go:1698 +0x3857
database/sql.dsnConnector.Connect(...)
        /usr/lib/go/src/database/sql/sql.go:759
database/sql.(*DB).conn(0xc0002e88f0, {0x109bcf8, 0xc000f142a0}, 0x1)
        /usr/lib/go/src/database/sql/sql.go:1393 +0x763
database/sql.(*DB).query(0xc001a2a540?, {0x109bcf8, 0xc000f142a0}, {0xc000619680, 0x23a}, {0xc0013f1960, 0x2, 0x2}, 0x1f?)
        /usr/lib/go/src/database/sql/sql.go:1730 +0x5d
database/sql.(*DB).QueryContext(0xc000619680?, {0x109bcf8, 0xc000f142a0}, {0xc000619680, 0x23a}, {0xc0013f1960, 0x2, 0x2})
        /usr/lib/go/src/database/sql/sql.go:1708 +0xda
github.com/k3s-io/kine/pkg/drivers/generic.(*Generic).query(0xc0003390e0, {0x109bcf8, 0xc000f142a0}, {0xc000619680, 0x23a}, {0xc0013f1960, 0x2, 0x2})
        /var/ncopa/src/kine/pkg/drivers/generic/generic.go:260 +0x1f6
github.com/k3s-io/kine/pkg/drivers/generic.(*Generic).ListCurrent(0xc0003390e0, {0x109bcf8, 0xc000f142a0}, {0xc00003e7b0, 0x22}, 0x1f5?, 0x0)
        /var/ncopa/src/kine/pkg/drivers/generic/generic.go:342 +0x18f
github.com/k3s-io/kine/pkg/logstructured/sqllog.(*SQLLog).List(0xc000401540, {0x109bcf8, 0xc000f142a0}, {0xc00003e750, 0x21}, {0xc00003e780, 0x21}, 0x649c99?, 0x0, 0x0)
        /var/ncopa/src/kine/pkg/logstructured/sqllog/sql.go:275 +0x1b8
github.com/k3s-io/kine/pkg/logstructured.(*LogStructured).List(0xc00019db60, {0x109bcf8, 0xc000f142a0}, {0xc00003e750, 0x21}, {0xc00003e780, 0x21}, 0x1f5, 0x0)
        /var/ncopa/src/kine/pkg/logstructured/logstructured.go:167 +0x16d
github.com/k3s-io/kine/pkg/server.(*LimitedServer).list(0xc00018ed80, {0x109bcf8, 0xc000f142a0}, 0xc0011745a0)
        /var/ncopa/src/kine/pkg/server/list.go:39 +0x2a8
github.com/k3s-io/kine/pkg/server.(*LimitedServer).Range(0x49?, {0x109b: no such file or directory\"" component=kine
Sep 07 11:45:19 lima-lens-vm-lima k0s[1073]: time="2022-09-07 11:45:19" level=info msg="time=\"2022-09-07T11:45:19Z\" level=error msg=\"error while range on /registry/roles/ /registry/roles/: disk I/O error: goroutine 37841 [running]:
github.com/mattn/go-sqlite3.lastError(0x4074d1?)
        /home/ncopa/src/go-sqlite3/sqlite3.go:831 +0xc5
github.com/mattn/go-sqlite3.(*SQLiteConn).lastError(...)
        /home/ncopa/src/go-sqlite3/sqlite3.go:810
github.com/mattn/go-sqlite3.(*SQLiteRows).nextSyncLocked(0xc000793ce0, {0xc0007dafc0, 0x1, 0xc000116000?})
        /home/ncopa/src/go-sqlite3/sqlite3.go:2200 +0x8b
github.com/mattn/go-sqlite3.(*SQLiteRows).Next.func1()
        /home/ncopa/src/go-sqlite3/sqlite3.go:2174 +0x32
created by github.com/mattn/go-sqlite3.(*SQLiteRows).Next
        /home/ncopa/src/go-sqlite3/sqlite3.go:2173 +0x1a5
: no such file or directory\"" component=kine

Full file: kine-trace4-formatted.txt

This looks very much like a problem with race conditions in go-sqlite3, and I believe there are problems in more than one place.

brandond commented 2 years ago

level=error msg=\"error in txn: disk I/O error: goroutine 4748 [running]:
github.com/mattn/go-sqlite3.lastError(0xc00162adb0?)
        /home/ncopa/src/go-sqlite3/sqlite3.go:831 +0xc5
github.com/mattn/go-sqlite3.(*SQLiteConn).lastError(...)
        /home/ncopa/src/go-sqlite3/sqlite3.go:810
github.com/mattn/go-sqlite3.(*SQLiteStmt).execSync(0xc00162adb0, {0xc0015f77c0?, 0xc001603bc0?, 0xc001607ad0?})
        /home/ncopa/src/go-sqlite3/sqlite3.go:2093 +0x154

I wouldn't say that it's a race condition; it looks to me like it's just passing errors up from the sqlite C code, which in turn is probably returning an error from an OS syscall, which you confirmed when you said:

strace seems to indicate that it is a stat call that returns ENOENT when looking for a state.db-journal

Everything points to this being an OS or filesystem error that's being propagated up the stack. The journal file is disappearing out from underneath sqlite for some unknown reason.

Are you doing anything weird with the filesystems on this node? Using overlays, remounting things, mounting volumes over subdirectories after the service has started, or so on?

ncopa commented 2 years ago

Are you doing anything weird with the filesystems on this node? Using overlays, remounting things, mounting volumes over subdirectories after the service has started, or so on?

No, not as far as I know. This is a very simple disk layout with everything on a single partition. I have ruled out filesystem-level issues at this point.

strace seems to indicate that it is a stat call that returns ENOENT when looking for a state.db-journal

Everything points to this being an OS or filesystem error that's being propagated up the stack. The journal file is disappearing out from underneath sqlite for some unknown reason.

What puzzles me is why it is looking for -journal in the first place. The file is opened with _journal=WAL so there should never be any state.db-journal.

I wonder if the db is opened more than once? It should only be opened once, and then all the goroutines should reuse the same connection, right?

brandond commented 2 years ago

What puzzles me is why it is looking for -journal in the first place. The file is opened with _journal=WAL so there should never be any state.db-journal.

The -journal file has nothing to do with the WAL, it's used for transactions: https://www.sqlite.org/tempfiles.html#rollback_journals

  • I wonder if the db is opened more than once? It should only be opened once, and then all the goroutines should reuse the same connection, right?

Yes, it's only opened once by kine, unless you are somehow running multiple kine/k3s processes pointed at the same path, which isn't something we support.

ncopa commented 2 years ago

What puzzles me is why it is looking for -journal in the first place. The file is opened with _journal=WAL so there should never be any state.db-journal.

The -journal file has nothing to do with the WAL, it's used for transactions: https://www.sqlite.org/tempfiles.html#rollback_journals

Maybe I am missing something, but the first section says:

A rollback journal is a temporary file used to implement atomic commit and rollback capabilities in SQLite. (For a detailed discussion of how this works, see the separate document titled Atomic Commit In SQLite.)

The Atomic Commit In SQLite says:

The information in this article applies only when SQLite is operating in "rollback mode", or in other words when SQLite is not using a write-ahead log. SQLite still supports atomic commit when write-ahead logging is enabled, but it accomplishes atomic commit by a different mechanism from the one described in this article. See the write-ahead log documentation for additional information on how SQLite supports atomic commit in that context.

The write-ahead log documentation begins with:

The default method by which SQLite implements atomic commit and rollback is a rollback journal. Beginning with version 3.7.0 (2010-07-21), a new "Write-Ahead Log" option (hereafter referred to as "WAL") is available.

The section How WAL Works explains:

The traditional rollback journal works by writing a copy of the original unchanged database content into a separate rollback journal file and then writing changes directly into the database file. ...

I believe that is the state.db-journal. Then it continues:

The WAL approach inverts this. The original content is preserved in the database file and the changes are appended into a separate WAL file.

I believe that is the state.db-wal. I also interpret all this as two mutually exclusive ways to do transactions: the default uses -journal, while the WAL method does not use the -journal. Instead it uses -wal.

In this kine case, we are using _journal=WAL, which as far as I understand means that the state.db-journal should never be used.

So for some unexplained reason, it seems that sqlite gets confused and believes it is in "rollback mode", while in reality it is in WAL mode.
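
One way to confirm which mode the database actually ends up in would be to open it with the same journal parameter kine passes and ask sqlite directly; a minimal sketch (path and parameters taken from the kine command line at the top, minus the listen address):

package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	// Same database file and journal parameter as the kine --endpoint flag.
	dsn := "file:/var/lib/k0s/db/state.db?_journal=WAL&cache=shared"
	db, err := sql.Open("sqlite3", dsn)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// "wal" means write-ahead logging is in effect; "delete" (the default)
	// would mean sqlite is using a rollback journal (state.db-journal).
	var mode string
	if err := db.QueryRow("PRAGMA journal_mode;").Scan(&mode); err != nil {
		log.Fatal(err)
	}
	fmt.Println("journal_mode:", mode)
}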

brandond commented 2 years ago

Yeah, it's hard to tell from the docs alone exactly why it would be trying to access the rollback journal in WAL mode. My recollection of the files in play is probably not super fresh as it's been a while since I went poking through the code. You might ask on the forums; I believe the maintainer is pretty active there: https://sqlite.org/forum/forum

Note that there are other mentions on the forum of the rollback journal still being accessed in unexpected modes:

ncopa commented 2 years ago

Yes, it's only opened once by kine, unless you are somehow running multiple kine/k3s processes pointed at the same path, which isn't something we support.

The backtraces indicate that the error also happens when calling Open. This happens after the error appears in other places, so it looks like there are multiple goroutines that retry opening it. This was also surprising to me, as I expected it to be opened only once, during startup, before listening on the socket.

brandond commented 2 years ago

This happens after the error appears in other places, so it looks like there are multiple goroutines that retry opening it. This was also surprising to me, as I expected it to be opened only once, during startup, before listening on the socket.

The sqlite code is pure C. Go-sqlite3 embeds the sqlite amalgamation using cgo; it is not a reimplementation in pure go. There's nothing in there that pins the database operations to a single goroutine, and there definitely may be multiple parallel goroutines accessing the database (and calling into the C code) at the same time. I would only expect the database to be opened once, but clearly the journal checks occur more frequently than that. I believe it checks whenever the database is being locked, but I'm not 100% on that.
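
Also worth noting: database/sql keeps a pool of connections and calls the driver's Open lazily whenever it needs a new one, which would explain Open showing up in backtraces well after startup. A small sketch of constraining that pool for an experiment (this is just an illustration, not how kine actually configures its pool):

package main

import (
	"database/sql"
	"log"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	db, err := sql.Open("sqlite3", "file:/var/lib/k0s/db/state.db?_journal=WAL&cache=shared")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// database/sql hands out connections from a pool and calls the driver's
	// Open for every new pooled connection, so sqlite's per-connection setup
	// (pragmas, journal handling, etc.) runs more than once over the process
	// lifetime. Limiting the pool to a single long-lived connection is one way
	// to test whether the problem depends on having several connections open.
	db.SetMaxOpenConns(1)
	db.SetMaxIdleConns(1)
	db.SetConnMaxLifetime(0) // 0 = reuse connections forever

	if err := db.Ping(); err != nil {
		log.Fatal(err)
	}
	log.Println("pool limited to one connection")
}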

I still say that this is all caused by something monkeying about with the underlying filesystem while kine is running, though. I don't think you're going to find the answer in the sqlite code.

ncopa commented 2 years ago

Everything points to this being an OS or filesystem error that's being propagated up the stack. The journal file is disappearing out from underneath sqlite for some unknown reason.

Turns out that you were right about this.

The problem was in the YAML I posted above:

      find /var/lib/k0s/ -maxdepth 2 -type f -size 0 | xargs rm || true

This was executed while k0s was starting and kine was running, and apparently it deleted the state.db-journal. Making the script keep state.db* solved the issue. So it was us doing stupid things.

Thank you very much for your help @brandond