Closed dineshbvadhia closed 10 years ago
Hi,
First of all, I think the official rocksdb GitHub page (https://github.com/facebook/rocksdb/issues) or the Facebook group (https://www.facebook.com/groups/rocksdb.dev/) is a far better place to ask this.
Regarding your question: it seems to me that rocksdb was designed especially for SSDs, but I also have good experience with it on HDDs. As far as I know, it has no special requirements on the filesystem. It tries to use fallocate to create the db files, but falls back to posix_fallocate in case the call is not present. So in general I would say "It works".
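If you want a quick check of whether preallocation works at all on the network mount, a minimal sketch from Python could look like the following. The function name `supports_preallocation` is made up for illustration. It only probes `os.posix_fallocate` from the standard library (Unix only) and says nothing certain about what rocksdb does internally. Note also that the C library may emulate the call on filesystems without native support, so a `True` result only means the call succeeded.

```python
import os
import tempfile


def supports_preallocation(directory, size=1 << 20):
    """Return True if a temp file in `directory` can be preallocated.

    Hypothetical helper: it calls os.posix_fallocate on a scratch file
    and checks that the file grew to the requested size.
    """
    if not hasattr(os, "posix_fallocate"):
        return False  # call is not available on this platform
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.posix_fallocate(fd, 0, size)
        return os.fstat(fd).st_size >= size
    except OSError:
        return False  # the filesystem (or the C library) rejected the call
    finally:
        # always clean up the scratch file
        os.close(fd)
        os.unlink(path)
```

Call it with the directory where the db would live, for example the mount point of the network filesystem.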
On performance I can only guess, so I would recommend running a benchmark on your own. If you compile rocksdb, there is already a tool available called './db_bench'. With this tool you can run benchmarks on an SSD, HDD, or network filesystem to see how it behaves. Have a look at https://github.com/facebook/rocksdb/wiki/Performance-Benchmarks to see how they use the tool. However, don't compare your results with the absolute numbers on that page: they used very high-end hardware for these tests (FusionIO devices). I am referencing this page only to show you how the tool works.
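Since you work from Python, here is a rough sketch of how the same db_bench run could be pointed at different directories (local disk vs. network mount). The helper names, the default benchmark list, and the key count are my own choices for illustration, not something from the wiki page; it assumes the compiled `./db_bench` binary is in the current directory and uses its `--benchmarks`, `--db` and `--num` flags.

```python
import subprocess


def db_bench_command(db_dir, benchmarks=("fillrandom", "readrandom"),
                     num=1000000, binary="./db_bench"):
    """Build the db_bench argument list for one target directory."""
    return [
        binary,
        "--benchmarks=" + ",".join(benchmarks),  # workloads to run, in order
        "--db=" + db_dir,                        # where the test db is created
        "--num=" + str(num),                     # number of keys
    ]


def run_db_bench(db_dir, **kwargs):
    """Run db_bench against `db_dir` and return its text output."""
    cmd = db_bench_command(db_dir, **kwargs)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Running `run_db_bench` once per storage location with the same arguments gives numbers that can be compared with each other, which is more useful than comparing against the wiki.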
Hi Stephan
Thanks for getting back. I asked the same question independently on the facebook page and got the same answer as yours.
I'm doing tests with large data sets on a cluster and unfortunately there is only a (very fast) network-attached filesystem available. The production system would have directly attached storage.
I use python and so will start using pyrocksdb in the next day or so. The initial requirements are very basic - first, write lots of data to populate the db and then it is mainly read.
Best ... Dinesh
From: stephan-hof Sent: Saturday, May 31, 2014 12:29 AM To: stephan-hof/pyrocksdb Cc: dineshbvadhia Subject: Re: [pyrocksdb] rocksdb and network filesystem (#9)
Hi,
if your workload is to first write lots of data to populate the db and then mainly read it, you may want to look at http://pyrocksdb.readthedocs.org/en/v0.2.1/api/database.html#rocksdb.DB.compact_range.
It is just an idea inspired by this paragraph: https://github.com/facebook/rocksdb/wiki/Performance-Benchmarks#test-1-bulk-load-of-keys-in-random-order. I have never used it myself on production data, but you may get a speedup on inserts/reads.
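To make the idea concrete, a minimal sketch of "bulk write, then compact once" could look like this. The helper names `bulk_load` and `load_with_pyrocksdb` and the batch size are assumptions for illustration; the pyrocksdb calls used are `rocksdb.DB`, `rocksdb.Options`, `rocksdb.WriteBatch`, `DB.write` and `DB.compact_range`. As said, this is untested on production data, so treat it as a starting point for your own measurements.

```python
def bulk_load(db, new_batch, items, batch_size=1000):
    """Write (key, value) byte pairs in batches, then compact once.

    `db` needs write(batch) and compact_range(); `new_batch` is a callable
    returning an object with put(key, value), e.g. rocksdb.WriteBatch.
    Returns the number of pairs written.
    """
    batch = new_batch()
    pending = 0
    total = 0
    for key, value in items:
        batch.put(key, value)
        pending += 1
        total += 1
        if pending >= batch_size:
            db.write(batch)      # flush a full batch
            batch = new_batch()
            pending = 0
    if pending:
        db.write(batch)          # flush the remainder
    db.compact_range()           # compact the whole key range after the load
    return total


def load_with_pyrocksdb(path, items):
    """Open (or create) a db at `path` and bulk load `items` into it."""
    import rocksdb  # pyrocksdb, imported lazily so bulk_load has no hard dependency
    db = rocksdb.DB(path, rocksdb.Options(create_if_missing=True))
    bulk_load(db, rocksdb.WriteBatch, items)
    return db
```

The compaction runs only once, after all writes, so the later read-mostly phase starts from an already compacted db.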
I think the question is answered => closing the ticket.
Is this the best place to ask general questions?
I know rocksdb is not a distributed db, but is it designed to work with a network filesystem, i.e. the machine runs rocksdb but the db is on a filesystem on the network?