krotik / eliasdb

EliasDB a graph-based database.
Mozilla Public License 2.0
998 stars 49 forks source link

benchmark? #5

Closed thealex42 closed 8 years ago

thealex42 commented 8 years ago

How fast are lookups? What is a complexity?

krotik commented 8 years ago

Hi,

Hmm, how long is a piece of string? :-) The answer is: the lookup speed depends on your setup, your configuration (your requirements) and how you tune the low-level storage. There are some configuration options which are hardcoded and not exposed in the actual config file. You have in the storage package for example a storage.CachedDiskStorageManager which keeps a portion of stored objects in memory. Or you could use a memory only storage with will always be orders of magnitudes faster than any disk storage solution. On the lowest level you could tweak the size of the transaction log file. Nodes are stored per attribute so the lookup speed also depends on how many attributes you query per node. There is also a full text index which lets you lookup node keys via phrases very quickly.

A lookup query for a particular node has a few layers to go through. Have a look at /doc/elias_db_design.md for details.

Once I had finished the low-level storage I've compared the Go code (Windows and Linux builds) with a similar Java implementation (only Windows) which I did years ago. The times are in milliseconds for storing short strings ("test-1", "test-2", ...). Initial insert, lookup, update and another lookup.

go1_6_linux_amd64

Strings,1000,Insert,19,Fetch,3,Update,17,Fetch,3
Strings,6000,Insert,65,Fetch,22,Update,55,Fetch,23
Strings,11000,Insert,103,Fetch,42,Update,91,Fetch,43
Strings,16000,Insert,138,Fetch,62,Update,127,Fetch,64
Strings,21000,Insert,177,Fetch,82,Update,165,Fetch,84
Strings,26000,Insert,215,Fetch,101,Update,203,Fetch,103
Strings,31000,Insert,253,Fetch,122,Update,236,Fetch,124
Strings,36000,Insert,291,Fetch,144,Update,272,Fetch,142
Strings,41000,Insert,327,Fetch,163,Update,307,Fetch,162
Strings,46000,Insert,364,Fetch,182,Update,343,Fetch,180

go1_6_win_amd64

Strings,1000,Insert,128,Fetch,12,Update,146,Fetch,5
Strings,6000,Insert,216,Fetch,44,Update,189,Fetch,49
Strings,11000,Insert,321,Fetch,99,Update,235,Fetch,84
Strings,16000,Insert,397,Fetch,105,Update,267,Fetch,100
Strings,21000,Insert,344,Fetch,130,Update,341,Fetch,206
Strings,26000,Insert,408,Fetch,174,Update,371,Fetch,194
Strings,31000,Insert,516,Fetch,212,Update,404,Fetch,213
Strings,36000,Insert,512,Fetch,270,Update,454,Fetch,231
Strings,41000,Insert,524,Fetch,304,Update,542,Fetch,277
Strings,46000,Insert,724,Fetch,318,Update,662,Fetch,293

java7_win_64

Strings,1000,Insert,161,Fetch,28,Update,190,Fetch2,10
Strings,6000,Insert,141,Fetch,12,Update,155,Fetch2,11
Strings,11000,Insert,171,Fetch,22,Update,170,Fetch2,18
Strings,16000,Insert,161,Fetch,24,Update,206,Fetch2,26
Strings,21000,Insert,191,Fetch,26,Update,207,Fetch2,30
Strings,26000,Insert,201,Fetch,37,Update,188,Fetch2,30
Strings,31000,Insert,226,Fetch,35,Update,239,Fetch2,45
Strings,36000,Insert,252,Fetch,33,Update,283,Fetch2,40
Strings,41000,Insert,275,Fetch,37,Update,271,Fetch2,39
Strings,46000,Insert,307,Fetch,50,Update,326,Fetch2,42
thealex42 commented 8 years ago

hi, thanks for the detailed answer!

my case is that I have N objects each with some fixed number of attributes. I need to get all objects that have defined attributes. like: Car { color: "Red", Transmission: "auto", Year: 2010} Car { color: "Red", Transmission: "manual", Year: 2010} Car { color: "Blue", Transmission: "auto", Year: 2010} query: get all cars with color=Red and Year=2010. I need fixed latency <1ms and > 50k RPS for such queries.

Actual objects contains 15-20 attributes, each attribute has up to 300 values and number of objects is relatively low, maybe 1000-3000. Objects are not related to each other.

Currently looking for a efficient data structure for it and wanted to try graph database. Wouldn't it be overkill for this task?

krotik commented 8 years ago

Hey,

wow, getting a response in under 1ms is quite a tight requirement. I would definitely keep the whole data set in memory and embed the data store. If the values are exact matches you could use the full text index which should be able to directly lookup the required list of matches.

More elaborate conditions can be expressed with EQL but you won't get a response in under 1ms.

I am not sure EliasDB is the right choice for this - the more features you add the slower it gets. The embedded memory only version with index lookups might work ...

thealex42 commented 8 years ago

Well, < 10ms will be ok too. Anyway I will try eliasdb and maybe some bitmap indexes in memory. Thanks!