replikativ / datahike-postgres

Datahike with Postgres as data storage
Eclipse Public License 1.0

An interesting option would be to use PostgreSQL with a foreign data wrapper + RocksDB #7

Closed by ieugen 4 years ago

ieugen commented 4 years ago

Hello,

I think this solution has potential because it provides a single interface for accessing two types of data store: relational and key-value.

PostgreSQL has added support for custom storage engines, so things should improve in the future. In the meantime, the foreign data wrapper API allows you to use other types of storage in PostgreSQL.

The people from VidarDB created an extension to use RocksDB (a LevelDB fork):

https://github.com/vidardb/PostgresForeignDataWrapper
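
To make the "single interface" idea concrete, here is a hedged sketch of what it could look like from Clojure over JDBC, assuming the FDW exposes the RocksDB data as an ordinary foreign table. The connection details and the `accounts` / `kv_store` table names are hypothetical, and the actual columns depend on how the VidarDB wrapper maps keys and values.

```clojure
;; Sketch only: table names and connection details are hypothetical.
(require '[next.jdbc :as jdbc])

(def ds
  (jdbc/get-datasource {:dbtype   "postgresql"
                        :dbname   "example"
                        :user     "alice"
                        :password "secret"}))

;; A regular relational table lives in PostgreSQL's own storage ...
(jdbc/execute! ds ["SELECT id, name FROM accounts WHERE id = ?" 42])

;; ... while a foreign table backed by RocksDB through the FDW is
;; queried over exactly the same SQL/JDBC interface.
(jdbc/execute! ds ["SELECT key, value FROM kv_store WHERE key = ?" "user:42"])
```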

whilo commented 4 years ago

Hey Eugen,

We use the backend stores, similarly to Datomic, mostly to store key-value blobs of serialized edn for our own persistent indices. This is sensible because these environments do not naturally support the indices' extended semantics. In that sense we would not gain a lot by adding another blob storage layer; it would just increase the complexity and dependencies needed for deployment. I might misunderstand your point, though. What would you need this combination for?
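
For context, konserve is the key-value abstraction Datahike writes those serialized index blobs through. A minimal sketch of that layer, assuming konserve's core.async-based API (exact namespaces and store constructors vary between versions and backends):

```clojure
(require '[konserve.memory :refer [new-mem-store]]
         '[konserve.core :as k]
         '[clojure.core.async :refer [<!!]])

;; An in-memory store stands in here for the Postgres/LevelDB/... backends.
(def store (<!! (new-mem-store)))

;; Datahike-style usage boils down to writing and reading opaque edn blobs
;; under a key; the backend never needs to understand their structure.
(<!! (k/assoc-in store [:index-root] {:some "serialized index node"}))
(<!! (k/get-in store [:index-root]))
```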

ieugen commented 4 years ago

Thanks for your feedback; I'm new to the ecosystem and looking to understand the use case better.
I appreciate you taking the time to answer.

What I know is that most applications benefit from both types of storage: key-value and relational. A lot of the time it's very useful to have persistent projections that are updated periodically. Also, there are a lot of features built into PostgreSQL that might be useful when building an app. Since Datahike has a LevelDB implementation, it might work the same with RocksDB.

Writing to the key-value store via the SQL / JDBC API should provide some benefits, like the ability to access a RocksDB database from multiple users via SQL (PostgreSQL manages the connections). Performance-wise it should be very close to accessing RocksDB directly, but I have not checked.

How would you deal with things like persistent projections in Datahike?

whilo commented 4 years ago

What do you mean by persistent projection?

Many things can be useful, but I think the difference between a key-value store and a relational store is maybe not as fundamental as you think. In fact we have a document store (konserve), which is a key-value store underlying our persistent hitchhiker-tree indices; Datahike queries infer their result sets from these indices. All these pieces already fit together.
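
As a hedged sketch of how those pieces surface at the API level (the config shape and schema handling differ between Datahike versions, so treat the details as illustrative):

```clojure
(require '[datahike.api :as d])

;; Illustrative in-memory config; a Postgres-backed store from this repo
;; would plug in here instead. :schema-flexibility :read keeps the example
;; free of an explicit schema transaction.
(def cfg {:store {:backend :mem :id "example"}
          :schema-flexibility :read})

(d/create-database cfg)
(def conn (d/connect cfg))

;; Transactions update the hitchhiker-tree indices, which konserve persists
;; as blobs in the configured store.
(d/transact conn [{:user/name "Alice"} {:user/name "Bob"}])

;; Queries are answered from those indices.
(d/q '[:find ?n :where [?e :user/name ?n]] @conn)
```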

ieugen commented 4 years ago

Thanks, I will try it out and see how it goes.

Regarding the projection:

A projection is derived data based on the data in the stream; it could be a count, for example. Instead of counting all of the things every time, you build and maintain a projection that processes the events and persists the result so it's available for query.

I'm referring to this concept https://domaincentric.net/blog/event-sourcing-projections
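
A minimal, library-agnostic Clojure sketch of that idea, where an atom stands in for whatever store would actually persist the projection, and the event shape is made up for the example:

```clojure
;; Hypothetical event shape: {:type :order-placed, :customer 42, ...}
(def projection (atom {:order-count 0}))   ; stands in for a persisted projection

(defn apply-event
  "Fold a single event into the projection instead of recounting everything."
  [proj event]
  (if (= (:type event) :order-placed)
    (update proj :order-count inc)
    proj))

(defn handle-event! [event]
  (swap! projection apply-event event))

;; Processing the stream keeps the count ready for cheap queries:
(run! handle-event! [{:type :order-placed  :customer 1}
                     {:type :order-shipped :customer 1}
                     {:type :order-placed  :customer 2}])

@projection ;=> {:order-count 2}
```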

whilo commented 4 years ago

Ah ok, that is often referred to as a materialized view in the database community. I will close this for now; feel free to open more issues if you have specific concerns.