ipfs / kubo

An IPFS implementation in Go
https://docs.ipfs.tech/how-to/command-line-quick-start/

Running out of space in the repo might result in a deadlocked datastore #5041

Open schomatis opened 6 years ago

schomatis commented 6 years ago

While adding content, if the repo runs out of disk space, the datastore might deadlock. I'm submitting this issue and a simple PoC to discuss how likely this scenario is and whether some precautionary measures should be taken. Although my main focus (and where this problem is more likely) is the Badger datastore, the problem also shows up with the flatfs datastore.

In summary, if the partition gets completely filled (e.g., while adding content, or because another system sharing the partition with IPFS fills it), the natural command a user would run to release space, `ipfs repo gc`, fails because it first tries to write to the repo (this also happens for other, simpler diagnostic commands). This seemed like an obvious problem for the Badger datastore (because of its inner workings), but a test with the flatfs datastore shows similar issues (I see the datastore writing to a manifest file I wasn't aware of).
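As a starting point for the discussion of precautionary measures, here is a minimal sketch of two user-side guards: checking free space before adding content, and keeping a ballast file that can be deleted to give cleanup commands room to write. All function names and the `ballast.tmp` file name are hypothetical, nothing here is part of the `ipfs` CLI, and I have not verified that freeing the ballast is enough for `ipfs repo gc` to succeed in the PoC.

```shell
#!/bin/sh
# Hypothetical helpers; none of these are provided by ipfs.

# Print the space available (in KiB) on the filesystem holding path $1.
free_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if at least $2 KiB are still available under path $1.
has_headroom() {
  [ "$(free_kb "$1")" -ge "$2" ]
}

# Create a ballast file of $2 KiB inside directory $1
# (assumed to live on the same partition as the repo).
make_ballast() {
  dd if=/dev/zero of="$1/ballast.tmp" bs=1024 count="$2" 2>/dev/null
}

# Delete the ballast to hand its space back before running cleanup commands.
drop_ballast() {
  rm -f "$1/ballast.tmp"
}
```

A possible use would be to run `make_ballast` once on the repo's partition, guard writes with something like `has_headroom "$IPFS_PATH" 1024 && ipfs add <file>`, and call `drop_ballast` before `ipfs repo gc` if the partition fills up anyway. The 1024 KiB threshold is arbitrary, and the check cannot stop another process from filling the partition between the check and the write.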

Note that the following PoC uses a tiny chunk size (50 bytes) to fill the partition to the brim. This is not realistic, it's just a simple test; what I would like to know is how likely it is for the partition to get "really" full during normal operation.

# Create a 10 MB partition for the repository.
export IPFS_PATH=/tmp/ipfs-tmpfs
mkdir -p "$IPFS_PATH"
sudo mount -t tmpfs -o size=10m tmpfs "$IPFS_PATH"

# Use the `flatfs` datastore.
rm -rf "${IPFS_PATH:?}"/*
ipfs init --empty-repo
dd if=/dev/urandom | ipfs add --chunker size-50 # Absurdly small chunk size
# Error: write /tmp/ipfs-tmpfs/blocks/MS/put-019222324: no space left on device

ipfs repo gc 2>&1 | head -c 1000
# Error: write /tmp/ipfs-tmpfs/datastore/MANIFEST-000006: no space left on device

ipfs repo stat
# Error: write /tmp/ipfs-tmpfs/datastore/MANIFEST-000006: no space left on device

# Now repeat with Badger.
rm -rf "${IPFS_PATH:?}"/*
ipfs init --empty-repo --profile badgerds
dd if=/dev/urandom | ipfs add --chunker size-50
# 2018/05/28 14:43:39 ERROR in Badger::writeRequests: Unable to write to value log file: "/tmp/ipfs-tmpfs/badgerds/000000.vlog": write /tmp/ipfs-tmpfs/badgerds/000000.vlog: no space left on device
# Error: Unable to write to value log file: "/tmp/ipfs-tmpfs/badgerds/000000.vlog": write /tmp/ipfs-tmpfs/badgerds/000000.vlog: no space left on device
# 2018/05/28 14:43:39 ERROR in Badger::writeRequests: Unable to write to value log file: "/tmp/ipfs-tmpfs/badgerds/000000.vlog": write /tmp/ipfs-tmpfs/badgerds/000000.vlog: no space left on device

# Additionally, sometimes a panic may happen (this should be reported to Badger).
# panic: runtime error: invalid memory address or nil pointer dereference
# [signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0xecc842]

ipfs repo gc 2>&1 | head -c 1000
# 2018/05/28 14:45:51 ERROR in Badger::writeRequests: Unable to write to value log file: "/tmp/ipfs-tmpfs/badgerds/000000.vlog": write /tmp/ipfs-tmpfs/badgerds/000000.vlog: no space left on device
# 2018/05/28 14:45:51 ERROR in Badger::writeRequests: Unable to write to value log file: "/tmp/ipfs-tmpfs/badgerds/000000.vlog": write /tmp/ipfs-tmpfs/badgerds/000000.vlog: no space left on device

# The `repo stat` at least can retrieve some information. 
ipfs repo stat
# NumObjects: 53515
# RepoSize:   10463024
# StorageMax: 10000000000
# RepoPath:   /tmp/ipfs-tmpfs
# Version:    fs-repo@6
# 2018/05/28 14:46:19 ERROR in Badger::writeRequests: Unable to write to value log file: "/tmp/ipfs-tmpfs/badgerds/000000.vlog": write /tmp/ipfs-tmpfs/badgerds/000000.vlog: no space left on device

A situation like this is more severe for Badger, as the system would need to write much more information (in the form of key deletion entries) to force an actual GC and release space. Unlike flatfs, where space can be reclaimed by removing specific key files from the partition, Badger offers no such clean shortcut.
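To illustrate why deletion itself needs free space in a log-structured store, here is a toy append-only log. It is a deliberately simplified sketch and not Badger's actual on-disk format; the function names are made up for this example, and keys and values are assumed to be single words.

```shell
#!/bin/sh
# Toy append-only log, NOT Badger's real format.
# Keys and values are assumed to contain no whitespace.

# Append a PUT record for key $2 with value $3 to log file $1.
log_put() { printf 'PUT %s %s\n' "$2" "$3" >> "$1"; }

# Deleting also appends: a tombstone record marks key $2 as gone.
log_delete() { printf 'DEL %s\n' "$2" >> "$1"; }

# Compaction rewrites the log keeping only live records.
# It needs scratch space for the temporary copy before the old log is replaced.
log_compact() {
  awk '$1 == "PUT" { live[$2] = $3 }
       $1 == "DEL" { delete live[$2] }
       END { for (k in live) print "PUT", k, live[k] }' "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}

# Print the size of file $1 in bytes.
log_size() { wc -c < "$1" | tr -d ' '; }
```

Calling `log_delete` makes the file larger, and space only comes back after `log_compact`, which itself has to write a temporary file first. On a completely full partition both steps fail, whereas a one-file-per-key layout can release space with a plain `rm`.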

whyrusleeping commented 6 years ago

@schomatis is that manifest file in flatfs the repo size? or is that leveldb being opened up?

schomatis commented 6 years ago

@whyrusleeping I'm not familiar with flatfs internals, so I don't really know what that manifest file is for (I'm also not sure what leveldb is being used for).