Zygo / bees

Best-Effort Extent-Same, a btrfs dedupe agent
GNU General Public License v3.0

beesstats.txt not updating #178

Closed SampsonF closed 3 years ago

SampsonF commented 3 years ago

This is the first time I am trying to run bees. Thank you very much for sharing bees with us.

I started with an empty btrfs FS_TREE, and it has been receiving a btrfs send from another disk for 1 hour (480GB / 2.1T done).

I can see crawl activity after the initial wait of 1099s.

I can see that /run/bees/UUID.status, .beeshome/beescrawl.dat and .beeshome/beeshash.dat are being updated.

But .beeshome/beesstats.txt is not.

Am I missing something?

I am following https://copr.fedorainfracloud.org/coprs/elxreno/bees/ as the configuration guide.

This is my config file:

## Config for Bees: /etc/bees/beesd.conf.sample
## https://github.com/Zygo/bees
## These are default values; change them if needed

# How to use?
# Copy this file to a new file name and adjust the UUID below

# Which FS will be used
UUID=ae208b98-d49b-4542-91ae-e2cfce5cf8b0

## System Vars
# Change carefully
WORK_DIR="/run/bees/"
MNT_DIR="$WORK_DIR/mnt/$UUID"
BEESHOME="$MNT_DIR/.beeshome"
BEESSTATUS="$WORK_DIR/$UUID.status"

## Options to apply, see `beesd --help` for details
# OPTIONS="--strip-paths --no-timestamps"

## Bees DB size
# Hash Table Sizing
# Hash table entries are 16 bytes each
# (64-bit hash, 52-bit block number, and some metadata bits)
# Each entry represents a minimum of 4K on disk.
# unique data size    hash table size    average dedup block size
#     1TB                 4GB                  4K
#     1TB                 1GB                 16K
#     1TB               256MB                 64K
#     1TB                16MB               1024K
#    64TB                 1GB               1024K
#
# Size MUST be multiple of 128KB
# DB_SIZE=$((1024*1024*1024)) # 1G in bytes
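
The sizing table in the config comments follows from simple arithmetic: one 16-byte entry per dedupe block, so the table size is roughly the unique data size divided by the average block size, times 16. Below is a minimal sketch of that calculation, assuming only the figures stated in the comments above (16 bytes per entry, size a multiple of 128KB, binary units). `hash_table_bytes` is a hypothetical helper for illustration and is not part of bees or beesd.

```shell
#!/bin/sh
# Hypothetical helper, not part of bees: estimate DB_SIZE from the table above.
# Assumption: 16 bytes per hash table entry, one entry per dedupe block.

# hash_table_bytes UNIQUE_BYTES BLOCK_BYTES
# Prints a suggested hash table size in bytes, rounded up to a multiple of 128KB.
hash_table_bytes() {
    unique_bytes=$1
    block_bytes=$2
    entry_bytes=16
    align=$((128 * 1024))
    raw=$(( unique_bytes / block_bytes * entry_bytes ))
    # Round up to the next multiple of 128KB
    echo $(( (raw + align - 1) / align * align ))
}

# 1TB of unique data with a 16K average dedupe block size
hash_table_bytes 1099511627776 16384   # prints 1073741824 (1GB)
```

The printed value matches the `1TB / 1GB / 16K` row of the table and the commented-out `DB_SIZE` default.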

uname -a
Linux bees 5.11.12-300.fc34.x86_64 #1 SMP Wed Apr 7 16:31:13 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

btrfs version
btrfs-progs v5.11.1

systemd unit file

cat beesd@ae208b98-d49b-4542-91ae-e2cfce5cf8b0.service 
[Unit]
Description=Bees (%i)
Documentation=https://github.com/Zygo/bees
After=sysinit.target

[Service]
Type=simple
ExecStart=/usr/sbin/beesd --no-timestamps %i
CPUAccounting=true
CPUSchedulingPolicy=batch
CPUWeight=12
IOSchedulingClass=idle
IOSchedulingPriority=7
IOWeight=10
KillMode=control-group
KillSignal=SIGTERM
MemoryAccounting=true
Nice=19
Restart=on-abnormal
StartupCPUWeight=25
StartupIOWeight=25

[Install]
WantedBy=basic.target

kakra commented 3 years ago

I'm not sure whether this is a CPU scheduler starvation issue, but this may improve the behavior: https://github.com/Zygo/bees/pull/135 (it probably needs some final thoughts from Zygo; I'm not even sure it is worth the effort, as I made it out of curiosity).

OTOH, beesstats.txt in .beeshome/ may not be updated in a timely manner, only every hour or so. The runtime stats file that you can set with BEESSTATUS is a more up-to-date source of stats (but it doesn't log the histogram):

# /etc/systemd/system/bees.service
[Unit]
Description=Bees
Documentation=https://github.com/Zygo/bees
After=sysinit.target
RequiresMountsFor=/mnt/btrfs-pool

[Service]
Type=simple
Environment=BEESSTATUS=%t/bees/bees.status
ExecStart=/usr/libexec/bees --no-timestamps --strip-paths --thread-factor 0.5 --loadavg-target=5 --verbose=5 /mnt/btrfs-pool
...

HTH

SampsonF commented 3 years ago

Thank you.

I just checked again; the timestamp of .beeshome/beesstats.txt is 05:25 local time.

So it did get updated after all.

Zygo commented 3 years ago

$BEESSTATUS is updated once per second. It's intended for a tmpfs or ramfs like /run, and gives near-real-time insight into what the bees threads are doing.

$BEESHOME/beesstats.txt is updated once an hour. It's intended for persistent storage and to give a snapshot of stats counters.
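
To check these two update intervals on a running system, compare each file's modification time with the current time. This is a minimal sketch, assuming GNU coreutils `stat` (as shipped on Fedora) and the paths from the beesd.conf posted earlier in this thread. `file_age_seconds` is a hypothetical helper and is not part of bees.

```shell
#!/bin/sh
# Hypothetical helper, not part of bees: report how long ago a file was modified.
# Assumption: GNU coreutils stat, where -c %Y prints mtime as seconds since the epoch.
file_age_seconds() {
    mtime=$(stat -c %Y "$1") || return 1
    now=$(date +%s)
    echo $(( now - mtime ))
}

# Paths follow the beesd.conf posted in this thread; adjust them for your setup.
UUID=ae208b98-d49b-4542-91ae-e2cfce5cf8b0
BEESSTATUS="/run/bees/$UUID.status"
BEESHOME="/run/bees/mnt/$UUID/.beeshome"

for f in "$BEESSTATUS" "$BEESHOME/beesstats.txt"; do
    if [ -e "$f" ]; then
        echo "$f: last modified $(file_age_seconds "$f")s ago"
    else
        echo "$f: not found"
    fi
done
```

With bees running, the status file should show an age of a second or two, while beesstats.txt can legitimately be up to about 3600s old.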