ocurrent / obuilder

Experimental "docker build" alternative using btrfs/zfs snapshots
Apache License 2.0

Tests fail with flaky `Unix.ENOENT` error on `db.sqlite-shm` #186

Open shonfeder opened 3 months ago

shonfeder commented 3 months ago

Seen in jobs like https://ocaml.ci.dev/github/ocurrent/obuilder/commit/cc9415851fb42cfa66386d3da41f5c1be8c4fd59/variant/ubuntu-22.04-5.2_opam-2.2

test.exe: [INFO] b1: "(from base)"
test.exe: [INFO] Base image not present; importing "base"…
test.exe: [INFO] Exec "sudo" "--" "mkdir" "-m" "755" "--" "/tmp/build_fc6ae3_dune/mock-store-df582c/cae662172fd450bb0cd710a769079c05bfc5d8e35efa6576edc7d0377afdd4a2-tmp/rootfs"
exec: [|sudo; --; mkdir; -m; 755; --;
        /tmp/build_fc6ae3_dune/mock-store-df582c/cae662172fd450bb0cd710a769079c05bfc5d8e35efa6576edc7d0377afdd4a2-tmp/rootfs|]
test.exe: [INFO] Exec "docker" "create" "--" "base"
exec: [|docker; create; --; base|]
test.exe: [INFO] Exec "docker" "export" "--" "base-7"
exec: [|docker; export; --; base-7|]
test.exe: [INFO] Exec "sudo" "--" "tar" "-C" "/tmp/build_fc6ae3_dune/mock-store-df582c/cae662172fd450bb0cd710a769079c05bfc5d8e35efa6576edc7d0377afdd4a2-tmp/rootfs" "-xf" "-"
exec: [|sudo; --; tar; -C;
        /tmp/build_fc6ae3_dune/mock-store-df582c/cae662172fd450bb0cd710a769079c05bfc5d8e35efa6576edc7d0377afdd4a2-tmp/rootfs;
        -xf; -|]
test.exe: [INFO] Exec "docker" "rm" "--force" "--" "base-7"
exec: [|docker; rm; --force; --; base-7|]
docker rm --force "base-7"
test.exe: [INFO] Exec "docker" "image" "inspect" "--format" "{{range .Config.Env}}{{print . "\x00"}}{{end}}" "--" "base"
exec: [|docker; image; inspect; --format;
        {{range .Config.Env}}{{print . "\x00"}}{{end}}; --; base|]
test.exe: [INFO] b1: "---> saved as \"cae662172fd450bb0cd710a769079c05bfc5d8e35efa6576edc7d0377afdd4a2\""
test.exe: [INFO] b1: "/: (run (shell Wait))"
test.exe: [INFO] b1: "Wait\n"
test.exe: [INFO] b2: "(from base)"
test.exe: [INFO] b2: "---> using \"cae662172fd450bb0cd710a769079c05bfc5d8e35efa6576edc7d0377afdd4a2\" from cache"
test.exe: [INFO] b2: "/: (run (shell Wait))"
test.exe: [INFO] b2: "Wait\n"
test.exe: [INFO] User cancelled job (users now = 1)
ASSERT User 1 result
test.exe: [INFO] User cancelled job (users now = 0)
ASSERT User 2 result
ASSERT Build cancelled
[exception] Unix.Unix_error(Unix.ENOENT, "lstat", "/tmp/build_fc6ae3_dune/mock-store-df582c/state/db/db.sqlite-shm")
            Raised by primitive operation at Lwt_unix.self_result in file "src/unix/lwt_unix.cppo.ml", line 246, characters 14-31

Logs saved to `/src/_build/default/test/_build/_tests/OBuilder/build.007.output'.

shonfeder commented 3 months ago

This is flaky and sometimes resolves on rebuilds.

shonfeder commented 3 months ago

Likely related to #179

shonfeder commented 3 months ago

This also happens in the `secrets` test, which I think confirms that the problem is flakiness in the test setup (e.g. a file being removed before a concurrent operation tries to `lstat` it) rather than a bug in the logic of any specific test:

  [FAIL]        secrets               1   No secret provided.

┌──────────────────────────────────────────────────────────────────────────────┐
│ [FAIL]        secrets               1   No secret provided.                  │
└──────────────────────────────────────────────────────────────────────────────┘
test.exe: [INFO] b: "(from base)"
test.exe: [INFO] Base image not present; importing "base"…
test.exe: [INFO] Exec "sudo" "--" "mkdir" "-m" "755" "--" "/tmp/build_a8920b_dune/mock-store-2451b5/cae662172fd450bb0cd710a769079c05bfc5d8e35efa6576edc7d0377afdd4a2-tmp/rootfs"
exec: [|sudo; --; mkdir; -m; 755; --;
        /tmp/build_a8920b_dune/mock-store-2451b5/cae662172fd450bb0cd710a769079c05bfc5d8e35efa6576edc7d0377afdd4a2-tmp/rootfs|]
test.exe: [INFO] Exec "docker" "create" "--" "base"
exec: [|docker; create; --; base|]
test.exe: [INFO] Exec "docker" "export" "--" "base-12"
exec: [|docker; export; --; base-12|]
test.exe: [INFO] Exec "sudo" "--" "tar" "-C" "/tmp/build_a8920b_dune/mock-store-2451b5/cae662172fd450bb0cd710a769079c05bfc5d8e35efa6576edc7d0377afdd4a2-tmp/rootfs" "-xf" "-"
exec: [|sudo; --; tar; -C;
        /tmp/build_a8920b_dune/mock-store-2451b5/cae662172fd450bb0cd710a769079c05bfc5d8e35efa6576edc7d0377afdd4a2-tmp/rootfs;
        -xf; -|]
test.exe: [INFO] Exec "docker" "rm" "--force" "--" "base-12"
exec: [|docker; rm; --force; --; base-12|]
docker rm --force "base-12"
test.exe: [INFO] Exec "docker" "image" "inspect" "--format" "{{range .Config.Env}}{{print . "\x00"}}{{end}}" "--" "base"
exec: [|docker; image; inspect; --format;
        {{range .Config.Env}}{{print . "\x00"}}{{end}}; --; base|]
test.exe: [INFO] b: "---> saved as \"cae662172fd450bb0cd710a769079c05bfc5d8e35efa6576edc7d0377afdd4a2\""
test.exe: [INFO] b: "/: (run (secrets (test (target /run/secrets/test)))\n        (shell Append))"
ASSERT Final result
[exception] Unix.Unix_error(Unix.ENOENT, "lstat", "/tmp/build_a8920b_dune/mock-store-2451b5/state/db/db.sqlite-wal")
            Raised by primitive operation at Lwt_unix.self_result in file "src/unix/lwt_unix.cppo.ml", line 246, characters 14-31
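
Both traces fail in `lstat` on a SQLite side file (`db.sqlite-shm`, `db.sqlite-wal`). SQLite removes these files when the last connection to a WAL-mode database closes cleanly, so one plausible reading of the race described above is that some code lists the `state/db` directory and then stats each entry while the database is being closed. The following is a minimal sketch of a directory removal that tolerates entries vanishing between the listing and the `lstat`. It assumes the failing code is a recursive walk of this kind, which the logs do not confirm. `lstat_opt`, `ignore_enoent` and `rm_rf` are hypothetical helpers, not part of OBuilder, and the sketch uses the blocking `Unix` module rather than `Lwt_unix` to stay self-contained.

```ocaml
(* Sketch only: these helpers are hypothetical and not OBuilder's API.
   Requires the [unix] library. *)

(* Like [Unix.lstat], but returns [None] if the path no longer exists,
   e.g. because another process or thread removed it after we listed it. *)
let lstat_opt path =
  match Unix.lstat path with
  | stats -> Some stats
  | exception Unix.Unix_error (Unix.ENOENT, _, _) -> None

(* Run [f], treating "already gone" as success. Other errors propagate. *)
let ignore_enoent f =
  try f () with Unix.Unix_error (Unix.ENOENT, _, _) -> ()

(* Recursively delete [path], tolerating entries removed concurrently. *)
let rec rm_rf path =
  match lstat_opt path with
  | None -> ()  (* vanished between the directory listing and the lstat *)
  | Some { Unix.st_kind = Unix.S_DIR; _ } ->
    (* Simplification: any listing failure is treated as an empty directory,
       which also hides errors such as missing permissions. *)
    let entries = try Sys.readdir path with Sys_error _ -> [||] in
    Array.iter (fun name -> rm_rf (Filename.concat path name)) entries;
    ignore_enoent (fun () -> Unix.rmdir path)
  | Some _ -> ignore_enoent (fun () -> Unix.unlink path)
```

With this shape, a `db.sqlite-shm` that disappears halfway through the walk is skipped instead of raising `Unix.Unix_error (Unix.ENOENT, "lstat", ...)`. The alternative is to make sure the database is fully closed before the test tears down its temporary store, which removes the race rather than tolerating it.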