Closed: kapitan-k closed this issue 6 years ago
It looks like some of your column families are being flushed aggressively while others aren't getting flushed at all. RocksDB interleaves data from all column families in the WAL file, so we can't archive/delete a WAL until all column families have flushed that WAL's data.
Does the total size of the WALs in your DB directory exceed 1GB (your max_total_wal_size)? When it does, we should automatically trigger flushes in such a way that we're able to drop old WALs and reclaim space. You can try reducing max_total_wal_size to see this more easily. Let us know if it's not happening.
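To make the rule above concrete, here is a toy model in Python. It is only a sketch of the behavior described in this reply, not RocksDB code: the function names `obsolete_wals` and `cfs_to_flush` and the dictionaries they take are made up for illustration, and the model assumes each column family can be summarized by the number of the oldest WAL that still holds its unflushed data.

```python
def obsolete_wals(wal_numbers, oldest_unflushed_wal_by_cf):
    """Return the WAL numbers that no column family still needs.

    A WAL can be dropped only if every column family has flushed the data
    it wrote there, i.e. the WAL is older than the oldest WAL any column
    family still depends on.
    """
    keep_from = min(oldest_unflushed_wal_by_cf.values())
    return [n for n in sorted(wal_numbers) if n < keep_from]


def cfs_to_flush(wal_sizes, oldest_unflushed_wal_by_cf, max_total_wal_size):
    """Return the column families to flush once the WAL size limit is exceeded.

    wal_sizes maps WAL number -> size in bytes. When the total exceeds
    max_total_wal_size, the column families that pin the oldest WAL are
    the ones that have to flush before that WAL can be reclaimed.
    """
    if sum(wal_sizes.values()) <= max_total_wal_size:
        return []
    oldest_wal = min(wal_sizes)
    return sorted(
        cf for cf, n in oldest_unflushed_wal_by_cf.items() if n <= oldest_wal
    )


if __name__ == "__main__":
    sizes = {10: 600, 11: 600}          # two WALs, 1200 bytes in total
    pinned = {"hot": 11, "cold": 10}    # "cold" has not flushed since WAL 10
    print(obsolete_wals(sizes, pinned))       # nothing can be dropped yet
    print(cfs_to_flush(sizes, pinned, 1000))  # "cold" must flush
```

In this model a single rarely flushed column family ("cold") keeps every later WAL alive, which is the situation the reply describes; once it flushes and its entry moves forward, the older WALs become obsolete.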
Thanks for your quick reply. As far as I know, all files ending in .log are WALs, correct? If so, the current total WAL size in the wal_dir is ~36GB with max_total_wal_size set to 1GB (line 10 of the LOG above: "Write Ahead Log file in /perfdisk/bigdata_main: ..."), and many of the WAL files are zero bytes. So this is not intended behavior, right? It seems like RocksDB just recovers the log files and ignores them afterwards.
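For anyone who wants to repeat this check, a minimal sketch in Python follows. It assumes, as in the comment above, that every regular file ending in .log in the WAL directory is a WAL; the helper name `wal_usage` is made up for this example, and the path in the usage line is the one quoted from the LOG.

```python
from pathlib import Path


def wal_usage(wal_dir):
    """Summarize the *.log files directly inside wal_dir.

    Returns the number of files, how many of them are zero bytes,
    and their combined size in bytes.
    """
    sizes = [
        p.stat().st_size
        for p in Path(wal_dir).glob("*.log")
        if p.is_file()
    ]
    return {
        "count": len(sizes),
        "empty": sum(1 for s in sizes if s == 0),
        "total_bytes": sum(sizes),
    }


if __name__ == "__main__":
    usage = wal_usage("/perfdisk/bigdata_main")
    print(usage)
    # Compare against the configured limit from the issue (1 GiB).
    print("over limit:", usage["total_bytes"] > 1073741824)
```

Comparing `total_bytes` with max_total_wal_size over time shows whether the total keeps growing past the limit or is being reclaimed.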
I haven't archived old LOG files, so those can't help at the moment. I switched from a "normal" DB (where this never happened) to a DB with multiple paths on Oct 23, with the same options but new data, and I checked the directory sporadically. The oldest log file of the 36GB is from Nov 03; up to then everything was fine.
So if you tell me the above options are OK, I will investigate further, starting with running memtest, storing on a different disk, and keeping log files. My system can more or less replay the events stored in the DB, so I would start there.
But the disk shouldn't be the problem anyway.
I copied the database to a different server and everything is running smoothly. The original server has been misbehaving over the last few days, so I assume a hardware failure rather than a problem with RocksDB. Thanks for your help.
Expected behavior
I expect RocksDB to delete WAL files.
Actual behavior
After some time in use, deletion stops.
Steps to reproduce the behavior
Could you please check whether my options are sane? As you can see below, I set, for example: WAL_ttl_seconds=60, WAL_size_limit_MB=360, max_total_wal_size=1073741824, delete_obsolete_files_period_micros=10000. I expect WAL files to be deleted relatively early. Deletion also does not happen when the above options are changed to, for example: WAL_ttl_seconds=0, WAL_size_limit_MB=0, max_total_wal_size=0,
Here is my current LOG for a normal DB open: