orientechnologies / orientdb

OrientDB is the most versatile DBMS supporting Graph, Document, Reactive, Full-Text and Geospatial models in one Multi-Model product. OrientDB can run distributed (Multi-Master), supports SQL, ACID Transactions, Full-Text indexing and Reactive Queries.
https://orientdb.dev
Apache License 2.0
4.72k stars, 869 forks

Have you ever thought of integrating OrientDB as a node level store for Apache Spark? #10170

Open MironAtHome opened 4 months ago

MironAtHome commented 4 months ago

I pulled this idea out of thin air: a columnar data store can be one-upped by a graph data store, with distributed indexes on top of it. The cool part is that it only takes an optimizer plug-in to have it generate accessors for OrientDB instead of Parquet. I think we can push GPUs out the window for the majority of projects and finally do most computations in memory, and I don't mean going with a delta architecture. The JVM can use all the memory on the node, and OrientDB is thread-friendly, so it can safely multitask.

tglman commented 4 months ago

Hi,

This is probably possible and quite interesting. On the other hand, it is well outside the scope of the OrientDB project itself, meaning that we would happily help someone else implement it, but we would not do it ourselves.

Regards