Closed thiko closed 2 years ago
Hi, @thiko. If you ended with an empty database after inserting several items, and then you deleted all of them, yes it can be normal. Internally clover uses badger, which is a log structured key value store. This means that when you delete an object, a "delete" record is appended to the value log. Logs should be compacted in the background, once the database reaches a suitable size.
EDIT: It seems that inside badger, space must be reclaimed by explicitly calling the RunValueLogGC
function. Currently this function is never called. I will write a fix for this. Thank you for reporting this issue
Hi @ostafen! Thanks for your response!
On my local setup, badger seems to reserve 2 GB on disk space immediately before even apply any operation to it (beside open the connection).
Anyway - I'm looking forward for your fix using the GC call :+1:
On my local setup, badger seems to reserve 2 GB on disk space immediately before even apply any operation to it (beside open the connection).
So you get a 2GB file even just after calling Open()
on an empty directory?
Can you post a snipped of code showing your usage? It will help me to have the whole picture :=)
Sure - it's a small playground project so far. Please find the related code parts below:
// package and other imports
import cloverdb "github.com/ostafen/clover"
type clover struct {
db *cloverdb.DB
}
func OpenCloverConnection(databaseName string) (*clover, error) {
con, err := cloverdb.Open(databaseName)
if err != nil {
return nil, err
}
return &clover{db: con}, nil
}
func (c *clover) CloseConnection() error {
return c.db.Close()
}
// package and other imports
func main() {
// Init. connection
db, err = storage.OpenCloverConnection("test-clover-db2")
if err != nil {
panic("Unable to open database connection!")
}
defer db.CloseConnection()
}
That's it ;)
Using go versio 1.18
mod.go contains github.com/ostafen/clover v1.0.1
Operating system is Ubuntu 18.04.6 LTS
When running go run .
it creates a new directory containing:
▶ ls -alth test-clover-db2
total 32K
drwxrwxr-x 2 thiko thiko 4,0K Apr 24 11:11 .
-rw-rw-r-- 1 thiko thiko 2,0G Apr 24 11:11 000001.vlog
-rw-rw-r-- 1 thiko thiko 128M Apr 24 11:11 00001.mem
-rw-rw-r-- 1 thiko thiko 1,0M Apr 24 11:11 DISCARD
-rw------- 1 thiko thiko 28 Apr 24 11:11 KEYREGISTRY
drwxrwxr-x 12 thiko thiko 4,0K Apr 24 11:11 ..
-rw-rw-r-- 1 thiko thiko 6 Apr 24 11:11 LOCK
-rw------- 1 thiko thiko 16 Apr 24 11:11 MANIFEST
Are you calling db.Close()
on the clover db instance inside db.CloseConnection()
?
Sorry, missed that part. I'll modify my original post to keep it together
@thiko by doing some quick tests it turns out that the problem is related to not calling db.Close()
properly.
Consider this snippet:
db, err := c.Open("./db")
if err != nil {
panic(err)
}
It produces this database:
-rw-rw-r-- 1 stefano stefano 2147483646 apr 24 11:32 000001.vlog
-rw-rw-r-- 1 stefano stefano 134217728 apr 24 11:32 00001.mem
-rw-rw-r-- 1 stefano stefano 1048576 apr 24 11:32 DISCARD
-rw------- 1 stefano stefano 28 apr 24 11:32 KEYREGISTRY
-rw-rw-r-- 1 stefano stefano 6 apr 24 11:32 LOCK
-rw------- 1 stefano stefano 16 apr 24 11:32 MANIFEST
But, if i correctly call db.Close()
:
db, err := c.Open("./db")
defer db.Close()
if err != nil {
panic(err)
}
what I get is the following:
-rw-rw-r-- 1 stefano stefano 20 apr 24 11:29 000001.vlog
-rw-rw-r-- 1 stefano stefano 1048576 apr 24 11:29 DISCARD
-rw------- 1 stefano stefano 28 apr 24 11:29 KEYREGISTRY
-rw------- 1 stefano stefano 16 apr 24 11:29 MANIFEST
@thiko, I run exactly the same snipped you posted. It should be fine. I'm getting the correct size since db.Close()
is called.
-rw-rw-r-- 1 stefano stefano 20 apr 24 11:29 000001.vlog
-rw-rw-r-- 1 stefano stefano 1048576 apr 24 11:29 DISCARD
-rw------- 1 stefano stefano 28 apr 24 11:29 KEYREGISTRY
-rw------- 1 stefano stefano 16 apr 24 11:29 MANIFEST
If i remove the call to db.Close()
I get the 2GB size
@ostafen you are right - the issue relates to a missing db.close()
call. I wasn't aware about the fact, that defer
is not called when killing the app using os signals (CTRL + C
in fact).
I'll handle the signal for a graceful shutdown.
My bad - sorry. Shall we close this issue?
@thiko, okay thank you :=)
Let's leave the issue open. Your problem of the initial 2GB size was due to db.Close()
not being called.
However, the problem related to the RunValueLogGC()
is a separate problem and needs a fix.
I'll work on it and release a fix by tomorrow :=)
Great - thanks for your help!
Hello @thiko @ostafen,
Thanks for checking this. I was facing this issue and I can also confirm that the problem was related to not using db.Close
. Now the database is just around 1 MB
I made the title of the issue more self-explainatory in order to help people facing the same issue to quickly locate this
With the last commit, I implemented a simple garbage collection strategy which basically calls RunValueLog()
periodically in the background every 5 minutes.
In the future, I will improve the GC strategy to run only during inactivity periods, but for now its does its job :=)
Hi all
First of all: thanks for your work - it's tackling my need and works pretty well :) One quick question: Is it intended for a nearly empty database to use > 2 GB disk space?
Example (see 000006.vlog):
Thanks in advance!