heavyai / heavydb

HeavyDB (formerly OmniSciDB)
https://heavy.ai
Apache License 2.0
2.94k stars, 446 forks

compute and storage separation #720

Open gunlinan opened 2 years ago

gunlinan commented 2 years ago

Seeking compute and storage separation, we looked at the omniscidb Foreign Storage and Caching feature, but realized that the storage type is immutable, so it is not much use to us.

We also looked at the ClickHouse hot_and_cold storage policy, but we worry about its asynchronous data synchronization and the consistency issues that could result. This path also complicates the storage cost structure.

Can omniscidb add mutability to foreign storage and/or bring the separation to the native storage type? The feature is important to the scalability and high availability of cloud-native OLAP services that run on ephemeral containers.

gunlinan commented 2 years ago

The separation should apply to string dictionaries as well. That would enable distributed processing of multiple dictionaries across all omniscidb servers. Compared with a single dedicated dictionary server, this approach improves elasticity and avoids a single point of failure.

cdessanti commented 2 years ago

Hi @gunlinan,

We are working on storage and compute separation, and on an evolution of the distributed architecture that is more flexible in both elasticity and resiliency. Partitioning of dictionaries is being considered, but I cannot say when these features will land in our code.

When I have news, I will get back to you here.

Candido

gunlinan commented 2 years ago

Is the separation also going to happen in the open source version?

cdessanti commented 2 years ago

I'm not sure; I will be able to say more once everything has been precisely defined.