qubole / rubix

Cache File System optimized for columnar formats and object stores
Apache License 2.0

Available cache space is computed before cache directories are cleaned up during startup #454

Closed sopel39 closed 3 years ago

sopel39 commented 3 years ago

See the constructor BookKeeper#BookKeeper(org.apache.hadoop.conf.Configuration, com.qubole.rubix.common.metrics.BookKeeperMetrics, com.google.common.base.Ticker). It contains:

    initializeCache(conf, ticker);
    cleanupOldCacheFiles(conf);

However, initializeCache uses java.io.File#getUsableSpace to compute the cache size. Therefore, when a node is restarted with, say, 80% of the cache partition already occupied, Rubix will use at most the remaining 20% of the space.

Should these method calls be reversed?
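
To illustrate the effect of the ordering, here is a minimal sketch using a toy model of the cache partition. The `Partition` class and both helper methods are hypothetical stand-ins, not the actual Rubix code; `usableSpace()` only mimics what java.io.File#getUsableSpace would report, and the sketch assumes the cache size is taken directly from that value.

```java
public class CacheStartupOrder {
    /** Toy model of a cache partition (hypothetical, not a Rubix class). */
    static final class Partition {
        final long capacityBytes;
        long staleCacheBytes;

        Partition(long capacityBytes, long staleCacheBytes) {
            this.capacityBytes = capacityBytes;
            this.staleCacheBytes = staleCacheBytes;
        }

        // Stand-in for java.io.File#getUsableSpace on the cache directory.
        long usableSpace() {
            return capacityBytes - staleCacheBytes;
        }

        // Stand-in for cleanupOldCacheFiles: removes cache files left by the previous run.
        void cleanupOldCacheFiles() {
            staleCacheBytes = 0;
        }
    }

    // Order described in the issue: the size is computed while old files still occupy the disk.
    static long sizeThenCleanup(Partition p) {
        long cacheSize = p.usableSpace();
        p.cleanupOldCacheFiles();
        return cacheSize;
    }

    // Reversed order: old files are removed first, so the whole partition is counted.
    static long cleanupThenSize(Partition p) {
        p.cleanupOldCacheFiles();
        return p.usableSpace();
    }

    public static void main(String[] args) {
        long capacity = 100L * 1024 * 1024 * 1024; // 100 GiB partition (assumed)
        long stale = capacity / 100 * 80;          // 80% occupied by old cache files

        System.out.println("size, then cleanup: " + sizeThenCleanup(new Partition(capacity, stale)));
        System.out.println("cleanup, then size: " + cleanupThenSize(new Partition(capacity, stale)));
    }
}
```

In this model the first ordering yields a cache size of 20% of the partition, while the reversed ordering yields the full partition.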

shubhamtagra commented 3 years ago

Yes, this needs to be fixed.

harmandeeps commented 3 years ago

Fixed in #455. Closing this issue. Thanks!

sopel39 commented 3 years ago

Will you create a PR to PrestoSQL once Rubix is released?

harmandeeps commented 3 years ago

@sopel39: yeah, I will create the PR.