qubole / rubix

Cache File System optimized for columnar formats and object stores
Apache License 2.0

Document Rubix Configuration #128

Closed vrajat closed 5 years ago

vrajat commented 6 years ago

The current README has the following markdown:

Configurations

BookKeeper server configurations

These configurations are to be provided as Hadoop configs when starting the BookKeeper server.

| Configuration | Default | Description |
| --- | --- | --- |
| hadoop.cache.data.bookkeeper.port | 8899 | The port on which the BookKeeper server listens |
| hadoop.cache.data.bookkeeper.max-threads | unbounded | Maximum number of threads BookKeeper can launch |
| hadoop.cache.data.block-size | 1048576 | Size in bytes of the blocks into which a file is logically divided internally. A higher value needs less space for metadata but can cause more data to be read than is needed |
| hadoop.cache.data.dirprefix.list | /media/ephemeral | Prefixes for paths of directories used to store cached data. Final paths are created by appending a suffix in the range [0, 5] followed by fcache |
| hadoop.cache.data.fullness.percentage | 80 | Percentage of total disk space to use for caching; backing files are deleted in an LRU manner |
| hadoop.cache.data.expiration | unbounded | How long data is kept in the cache |

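
To make the block-size trade-off and the directory naming rule concrete, here is a minimal Python sketch. It is not RubiX code: the function names are hypothetical, the assumption that a read is rounded out to whole blocks is an illustration of why a larger block size can read extra data, and the `/` separator before `fcache` is a guess at how the suffix is joined.

```python
BLOCK_SIZE = 1048576  # default of hadoop.cache.data.block-size, in bytes


def block_aligned_range(offset, length, block_size=BLOCK_SIZE):
    """Return (start, end) of the block-aligned byte range covering a read.

    Assumption: a read is widened to whole logical blocks. The file length
    is ignored here, so `end` may lie past the end of a real file.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    if length == 0:
        return (offset, offset)
    first_block = offset // block_size
    last_block = (offset + length - 1) // block_size
    return (first_block * block_size, (last_block + 1) * block_size)


def extra_bytes(offset, length, block_size=BLOCK_SIZE):
    """Bytes covered by the aligned range beyond what the caller asked for."""
    start, end = block_aligned_range(offset, length, block_size)
    return (end - start) - length


def cache_dirs(prefix="/media/ephemeral", max_suffix=5):
    """Expand one directory prefix into candidate cache directories.

    Assumption: suffixes 0..max_suffix inclusive, joined as <prefix><n>/fcache.
    """
    return [f"{prefix}{i}/fcache" for i in range(max_suffix + 1)]
```

Under these assumptions, a 10-byte read at offset 100 covers one full block with the default block size, while the same read with a 4096-byte block size covers far fewer extra bytes, at the cost of more blocks to track per file.
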
FileSystem configurations

These configurations must be provided by the engine that uses RubiX.

| Configuration | Default | Description |
| --- | --- | --- |
| hadoop.cache.data.enabled | true | Whether to use the cache |
| hadoop.cache.data.strict.mode | false | By default, RubiX tries not to fail read requests when it hits errors and falls back to reading directly from the remote source. Setting this config to true fails the read request if there were errors in RubiX |
| hadoop.cache.data.location.blacklist | empty | Regex matching locations that should not be cached |
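
The interaction of the three FileSystem settings can be sketched as follows. This is an illustrative Python model, not the RubiX implementation: `read`, `is_blacklisted`, `cached_read`, and `remote_read` are hypothetical names, and treating the blacklist as a substring-style regex search (rather than a full match) is an assumption.

```python
import re


def is_blacklisted(location, blacklist_regex=""):
    """True if the location matches the blacklist regex (empty means none)."""
    if not blacklist_regex:
        return False
    # Assumption: a match anywhere in the location counts.
    return re.search(blacklist_regex, location) is not None


def read(location, cached_read, remote_read,
         enabled=True, strict_mode=False, blacklist_regex=""):
    """Read a location, honoring enabled, strict.mode and location.blacklist.

    cached_read and remote_read are caller-supplied callables taking a location.
    """
    # Cache disabled or location blacklisted: go straight to the remote source.
    if not enabled or is_blacklisted(location, blacklist_regex):
        return remote_read(location)
    try:
        return cached_read(location)
    except Exception:
        if strict_mode:
            # Strict mode: surface the cache error to the caller.
            raise
        # Default: fall back to reading directly from the remote source.
        return remote_read(location)
```

With `strict_mode=False` a failing cache read silently returns the remote result, which favors availability; `strict_mode=True` is useful when cache errors should be visible rather than masked.
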