objectbox / objectbox-go

Embedded Go Database, the fast alternative to SQLite, gorm, etc.
https://objectbox.io
Apache License 2.0

Scalability #8

Closed brockelmore closed 5 years ago

brockelmore commented 5 years ago

Hi all,

I know this is targeted for embedded systems, but I like the idea of object storage. It makes my life easier. That being said, what kind of performance hits will I experience as the database scales (say, to 350GB+)?

Thanks!

greenrobot commented 5 years ago

ObjectBox is built on scalable technology (e.g. B+ trees). In our experience it works fine with multi-GB data sets. Do you have a test data set? We'd be very curious about the results... :smiley: Let us know if you need any support there.

greenrobot commented 5 years ago

Closing this, as we have had no further response so far. Feel free to reopen when you have new info to share or questions.

brockelmore commented 5 years ago

Sorry - my dataset is Bitcoin, with some custom features and whatnot that bump it up to 350 GB. It is easily downloadable, and there are Go libraries that can be used to build the objects. I turned to a Rust-based, blockchain-specific database implementation, mainly because my use case is extremely narrow. But you guys have some mindshare with me now, and in the future I will look to you guys!

greenrobot commented 5 years ago

Thanks, @brockelmore, for your update. Can you share more about the "blockchain specific database implementation"?

brockelmore commented 5 years ago

Yep, it's here: Hammersbald. The nature of blockchain is that it is append-only*, and our database should leverage that fact. Additionally, it is a key-value store that maps keys to pointers into the append-only store. Data is never deleted. Data structures can be optimized for insert and retrieve operations; we do not need to care how expensive it is to amend them. Keys do not define a meaningful order. Popular key-value stores use structures that allow key iteration in a defined order. We do not need that, and therefore may use a simple hash table instead of a search tree. The data has internal references, e.g. to the previous block. We want to follow those references directly, without a key-value lookup.

A note on "append-only": this excludes reorganizations of blocks, which happen only at the tip of the blockchain and, with nearly 100% certainty, cannot occur deeper than 6 blocks.
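
To make the design above concrete, here is a minimal Go sketch of the idea: an append-only log, an unordered hash table mapping keys to offsets in that log, and records that carry the offset of their predecessor so references can be followed without a key lookup. This is not Hammersbald's actual API or on-disk format. The names `Store`, `Append`, `ReadAt`, `Get`, and `noPrev`, as well as the record layout, are hypothetical, and an in-memory byte slice stands in for the append-only file.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// noPrev is a hypothetical sentinel offset meaning "no previous record".
const noPrev = ^uint64(0)

// headerSize is the size of the toy record header:
// 8 bytes for the previous record's offset, 4 bytes for the payload length.
const headerSize = 12

// Store is a toy append-only store. Records are only ever appended to log;
// index is an unordered hash table from key to the record's offset in log.
type Store struct {
	log   []byte            // stand-in for an append-only file
	index map[string]uint64 // key -> offset of the record in log
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{index: make(map[string]uint64)}
}

// Append writes [prev offset][payload length][payload] at the end of the log,
// records the key in the hash table, and returns the new record's offset.
func (s *Store) Append(key string, prev uint64, payload []byte) uint64 {
	off := uint64(len(s.log))
	var hdr [headerSize]byte
	binary.LittleEndian.PutUint64(hdr[0:8], prev)
	binary.LittleEndian.PutUint32(hdr[8:12], uint32(len(payload)))
	s.log = append(s.log, hdr[:]...)
	s.log = append(s.log, payload...)
	s.index[key] = off
	return off
}

// ReadAt decodes the record stored at off without consulting the hash table.
// The returned payload aliases the log and must not be modified.
func (s *Store) ReadAt(off uint64) (prev uint64, payload []byte, err error) {
	size := uint64(len(s.log))
	if off > size || size-off < headerSize {
		return 0, nil, errors.New("offset out of range")
	}
	prev = binary.LittleEndian.Uint64(s.log[off : off+8])
	n := uint64(binary.LittleEndian.Uint32(s.log[off+8 : off+12]))
	if size-off-headerSize < n {
		return 0, nil, errors.New("truncated record")
	}
	return prev, s.log[off+headerSize : off+headerSize+n], nil
}

// Get performs one hash lookup to find the offset, then reads the record.
func (s *Store) Get(key string) (prev uint64, payload []byte, err error) {
	off, ok := s.index[key]
	if !ok {
		return 0, nil, errors.New("key not found")
	}
	return s.ReadAt(off)
}

func main() {
	s := NewStore()
	genesis := s.Append("block-0", noPrev, []byte("genesis"))
	b1 := s.Append("block-1", genesis, []byte("second"))
	s.Append("block-2", b1, []byte("third"))

	// One key lookup for the tip, then follow stored offsets directly.
	prev, payload, err := s.Get("block-2")
	for err == nil {
		fmt.Println(string(payload))
		if prev == noPrev {
			break
		}
		prev, payload, err = s.ReadAt(prev)
	}
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

Running this walks the chain from the tip back to the first record, printing `third`, `second`, `genesis`. Only the first read goes through the hash table. A real store would also need persistence, a hash table that survives restarts, and a way to handle the tip reorganizations mentioned in the note above, none of which this sketch attempts.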

Another alternative is BlockSci. It is not truly a database, but rather a collection of databases and flat files that leverages similar aspects of blockchains.