superfly / litefs

FUSE-based file system for replicating SQLite databases across a cluster of machines
Apache License 2.0
3.78k stars 89 forks source link

Disk IO errors with simple usage #28

Closed jwhear closed 1 year ago

jwhear commented 1 year ago

I've tried working through the README directions but consistently run into Disk IO errors. I'm using the 0.1.0 release. I've distilled everything down to a simple Bash script for reproducing:

#!/bin/bash
set -xe

# Kernel and OS info for debugging
uname -r
cat /etc/os-release

# Assumes you want to place a specific version of the LiteFS binary in the local
#  directory for testing
LITEFS_CMD="./litefs"

function cleanup {
    # Terminate all child processes
    pkill -P $$
}
trap cleanup EXIT

consul agent -dev &
sleep 2 # give Consul a few seconds to start

# We'll create everything in a scratch directory called "repro"
# Blow away left-overs from any previous run
rm -rf repro
mkdir -p repro/data

cat <<EOF >repro/litefs.yml
mount-dir: repro/data
debug: true
http:
    addr: ":20202"

consul:
    url: "http://localhost:8500"
    advertise-url: "http://localhost:20202"
EOF

$LITEFS_CMD -config repro/litefs.yml &
sleep 2 # give LiteFS a few seconds to start

echo "====== Running SQLite against the test database now ========"
sqlite3 -column repro/data/test.db <<EOF
PRAGMA journal_mode;
CREATE TABLE test (foo INT);
INSERT INTO test VALUES (1);
EOF

echo "Success!"

The output of running this is attached repro.log

benbjohnson commented 1 year ago

@jwhear Thanks for such an excellent, detailed repro script. I'll take a look and see what's going on.

jwhear commented 1 year ago

I've found an additional data point by Dockerizing: on Alpine it works great, on Fedora I continue to see disk IO errors.

# Works great
docker run --rm -it (docker build -qf Dockerfile.alpine)

# Disk IO error
docker run --rm -it --cap-add SYS_ADMIN --device /dev/fuse (docker build -qf Dockerfile.fedora)

Dockerfile.alpine Dockerfile.fedora

jwhear commented 1 year ago

Debian-based test also fails for me although at this point I suspect the host system's FUSE device is the culprit: docker run --rm -it --cap-add SYS_ADMIN --device /dev/fuse (docker build -qf Dockerfile.debian)

Dockerfile.debian

I don't know anything about FUSE, but here's what version I've got on the host OS:

) fusermount3 -V
fusermount3 version: 3.10.5