BohuTANG / nessDB

A very fast transactional key-value, embedded database storage engine in Fractal-Tree. Teaching/Research purposes only.

benchmark feedback #9

Closed rafal98 closed 12 years ago

rafal98 commented 12 years ago

Hi, I am reposting here what I posted on Google Code.

Here is the diff with my tuning:

-#define KEYSIZE 20
-#define VALSIZE 100
-#define NUM 1000000
+#define KEYSIZE 16
+#define VALSIZE 24
+#define NUM 100000000
 #define R_NUM 10000
 #define REMOVE_NUM 10000
-#define BUFFERPOOL (1024*1024*1024)
-#define BGSYNC (1)
+#define BUFFERPOOL (2*1024*1024*1024)
+#define BGSYNC (0)

@@ -88,10 +88,16 @@ void random_value()
 }

 void random_key(char *key,int length) {
+/*
 char salt[36]= "abcdefghijklmnopqrstuvwxyz0123456789";
 memset(key,0,length);
 for (int i = 0; i < length; i++) key[i] = salt[rand() % length];
+*/
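One detail visible in the commented-out body: the index into salt is rand() % length, so with a 16-byte key only the first 16 of the 36 characters can ever be chosen. A minimal sketch of a generator that draws from the whole alphabet is given here. The names random_key_full, SALT, and SALT_LEN are hypothetical, and this is not the replacement rafal98 actually used, since that part of the diff is not shown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Alphabet copied from the benchmark snippet; sizeof counts the trailing NUL. */
static const char SALT[] = "abcdefghijklmnopqrstuvwxyz0123456789";
#define SALT_LEN (sizeof(SALT) - 1) /* 36 usable characters */

/*
 * Hypothetical variant of random_key(): fills key[0..length-1] with
 * characters drawn from the whole alphabet. Like the original, it does
 * not NUL-terminate the buffer.
 */
void random_key_full(char *key, int length)
{
    assert(key != NULL && length > 0);
    memset(key, 0, (size_t)length);
    for (int i = 0; i < length; i++)
        key[i] = SALT[(size_t)rand() % SALT_LEN];
}
```

Calling random_key_full(buf, KEYSIZE) in place of random_key(buf, KEYSIZE) would be the intended usage, assuming buf holds at least KEYSIZE bytes.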

What version of the product are you using? On what operating system?

Please provide any additional information below.

Some other information:

|Random-Write (done:99999999): 0.000035 sec/op; 28911.6 writes/sec(estimated); 2.6 MB/sec; cost:3458.821(sec)
lch@hyperstream:/DT/local/nessDB/src$ du -k .
14781888 ./ndbs
=> 151 bytes per record instead of 24 ...

-rw-r--r-- 1 lch users 245325984 2011-10-26 15:58 ness0.db
-rw-r--r-- 1 lch users 918359689 2011-10-26 15:58 ness0.idx
-rw-r--r-- 1 lch users 245150720 2011-10-26 15:58 ness10.db
-rw-r--r-- 1 lch users 918612711 2011-10-26 15:58 ness10.idx
-rw-r--r-- 1 lch users 245293760 2011-10-26 15:58 ness11.db
-rw-r--r-- 1 lch users 918339284 2011-10-26 15:58 ness11.idx
-rw-r--r-- 1 lch users 257679456 2011-10-26 15:58 ness12.db
-rw-r--r-- 1 lch users 917527165 2011-10-26 15:58 ness12.idx
[...]
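The bytes-per-record figure follows from the du -k total (reported in 1024-byte units) divided by the number of entries. A small sketch of that arithmetic, with bytes_per_record as a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Average on-disk bytes per record, given a `du -k` total (1024-byte units). */
double bytes_per_record(uint64_t du_kib, uint64_t records)
{
    assert(records > 0);
    return (double)du_kib * 1024.0 / (double)records;
}
```

With the numbers above, bytes_per_record(14781888, 100000000) comes out at roughly 151 bytes, against a raw payload of 16 + 24 = 40 bytes per key/value pair.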

Hope you can continue to enhance this promising DB :)

BohuTANG commented 12 years ago

Hi, yesterday I made some code changes, mainly to reduce the index file size (less disk space usage).

1) nessDB's buffer pool size has no effect on random writes; it only serves as a cache for hot data on reads. nessDB has no bulk write, so when you write one entry, it is basically written straight to disk (unless the system caches it).

2) The write speed decreases quickly, possibly because the B+Tree "key" size (in storage.h) was set too high (64 bytes, fixed length); it is now 32 bytes, fixed length.

3) The DB size (.db + .idx) is too large because each nessDB "idx" structure is: {char sha1[SHA1_LENGTH]; __be64 offset; __be64 child;}

SHA1_LENGTH is 64 (currently 64 bytes), so one "idx" entry is (64+8+8) = 80 bytes.

nessDB's "db" structure is: {__be32 len; data;} so one "db" entry is (4+32) = 36 bytes (the data length is a power of 2).

This can be optimized by using int32 instead of int64.

4) nessDB's write performance is indeed a problem. I am also thinking about ways to optimize it, and I believe it will get better. You can re-download the new code from GitHub (not Google Code) and test it (with the buffer pool size left at its default). Thank you for your feedback.
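The per-entry sizes quoted in point 3 can be checked with a short sketch. The struct below is a stand-in, not nessDB's actual declaration: uint64_t replaces the big-endian 64-bit fields, the names idx_entry, next_pow2, and db_record_size are hypothetical, and rounding the data length up to the next power of two is an assumption based on the 24-byte value becoming 32 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHA1_LENGTH 64 /* value quoted in point 3 */

/* Stand-in for one "idx" entry; uint64_t replaces the big-endian fields. */
struct idx_entry {
    char     sha1[SHA1_LENGTH];
    uint64_t offset;
    uint64_t child;
};

/* Smallest power of two >= n (assumed rounding rule). */
uint32_t next_pow2(uint32_t n)
{
    assert(n >= 1 && n <= 0x80000000u);
    uint32_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Size of one "db" record: 4-byte length prefix plus padded data. */
size_t db_record_size(uint32_t data_len)
{
    return sizeof(uint32_t) + next_pow2(data_len);
}
```

On common ABIs sizeof(struct idx_entry) is 80, and db_record_size(24) is 36, matching the (64+8+8) and (4+32) figures above. Shrinking offset and child to 32-bit fields would bring the idx entry down to 72 bytes.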

I hope you will keep following the project, because I optimize it every day.

BohuTANG

BohuTANG commented 12 years ago

Again, I just made some changes. You can now re-download the code, set the "NUM" macro in "nessdb-bench.c" to whatever you want (keep the others at their defaults), run "make", and start the test. If possible, send me the test results; I would be very grateful.

I noticed your drive is an SSD, so writes should be very fast. I have only run tests in the tens of millions on my ordinary PC.

rafal98 commented 12 years ago

Hi,

I reran roughly the same test with your new code, and it is much better: faster, and it uses less disk space. I changed the random key generation because, in my first attempt, it generated too many collisions (only about half of the NUM set was actually added). Speed is about 65 K inserts/s instead of 29 K inserts/s, and the DB size is 6.4 GB instead of 14.1 GB.

Congratulations :)

Here are the bench diff and the result:

-#define VALSIZE 80
-#define NUM 2000000
+#define VALSIZE 24
+#define NUM 100000000
-#define BGSYNC (1)
+#define BGSYNC (0)

Keys: 16 bytes each
Values: 24 bytes each
Entries: 100000000
IndexSize: 2288.8 MB (estimated)
DataSize: 2670.3 MB (estimated)

BG SYNC: close...

nessDB: version 1.7(Multiple && Distributable B+Tree with Level-LRU,Background IO Sync)
Date: Sun Oct 30 14:51:19 2011
CPU: 8 * Intel(R) Xeon(R) CPU E31245 @ 3.30GHz
CPUCache: 8192 KB

+-----------------------+---------------------------+----------------------------------+---------------------+
|Random-Write (done:98881145): 0.000015 sec/op; 65516.8 writes/sec(estimated); 3.3 MB/sec; cost:1509.249(sec)
+-----------------------+---------------------------+----------------------------------+---------------------+
lch@hyperstream:/DT/local/nessDB/src$ du -k ndbs/
6688576 ndbs/

BohuTANG commented 12 years ago

Hi, nice, glad to hear this news, and SSDs are powerful. But if you test on an HDD, random-write performance is not as good as expected. I think improving nessDB's random writes on HDD is difficult (it may need a different index structure instead of the B+Tree). This is a long-term effort, but I will keep going, "aza aza fighting"! Thanks again for your feedback.

BohuTANG

BohuTANG commented 12 years ago

Hi, bro. The new code is on GitHub now; random writes are 2x faster than in previous versions. I think I should close this issue.