risingwavelabs / risingwave

Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming and batch. PostgreSQL compatible.
https://go.risingwave.com/slack
Apache License 2.0

Tracking: Support ColumnFamily #79

Closed: twocode closed this 2 years ago

twocode commented 2 years ago

Bummock will be built on Hummock to support relational, semi-structured, and schemaless data models. It will map base tables/collections to Hummock ColumnFamilies, with abilities such as:

  1. Physically partition base tables for security and tenancy.
  2. Compactions are scoped to a single ColumnFamily.
  3. Support cross-table transactions.
  4. Variants of ColumnFamily will be optimized for the internal layouts of structured, semi-structured, and schemaless data, respectively.

The difference from the existing Keyspace is that a Keyspace is just a naturally colocated key range without the functions above. ColumnFamily is an enhancement on top of that.
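
As a rough illustration of the idea, here is a minimal Rust sketch of a store that maps each base table to its own ColumnFamily, with compaction scoped to one family. Every name in it (`Store`, `ColumnFamily`, `Layout`, `Run`) is hypothetical and not part of Hummock's API, and a `BTreeMap` stands in for a sorted run (SST).

```rust
use std::collections::{BTreeMap, HashMap};

/// One sorted run of key-value pairs (a hypothetical stand-in for an SST).
type Run = BTreeMap<Vec<u8>, Vec<u8>>;

/// Hypothetical layout variants, one per data model named in the proposal.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Layout {
    Structured,
    SemiStructured,
    Schemaless,
}

/// Hypothetical ColumnFamily: owns its runs, so it is physically separate
/// from every other family and can be compacted on its own.
#[derive(Debug)]
struct ColumnFamily {
    layout: Layout,
    runs: Vec<Run>, // oldest first, newest last
}

impl ColumnFamily {
    fn new(layout: Layout) -> Self {
        Self { layout, runs: Vec::new() }
    }

    /// Append a new run, as a memtable flush would.
    fn flush(&mut self, run: Run) {
        self.runs.push(run);
    }

    /// Merge all runs of THIS family into one; newer values overwrite older ones.
    fn compact(&mut self) {
        let mut merged = Run::new();
        for run in self.runs.drain(..) {
            merged.extend(run);
        }
        self.runs.push(merged);
    }

    /// Read the newest value for a key by scanning runs from newest to oldest.
    fn get(&self, key: &[u8]) -> Option<&Vec<u8>> {
        self.runs.iter().rev().find_map(|run| run.get(key))
    }
}

/// Hypothetical store: one shared handle, one ColumnFamily per base table.
#[derive(Debug, Default)]
struct Store {
    families: HashMap<String, ColumnFamily>,
}

impl Store {
    fn create_table(&mut self, table: &str, layout: Layout) {
        self.families.insert(table.to_string(), ColumnFamily::new(layout));
    }

    fn family_mut(&mut self, table: &str) -> Option<&mut ColumnFamily> {
        self.families.get_mut(table)
    }
}
```

In this sketch, calling `compact()` on the family behind one table leaves the runs of every other table untouched, which is the scoping that a plain Keyspace (a key range inside one shared tree) does not give.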

fuyufjh commented 2 years ago

What is this for?

soundOfDestiny commented 2 years ago

What is this for?

Schemaless Bummock?

fuyufjh commented 2 years ago

What is this for?

Schemaless Bummock?

I am totally confused. Can you explain in more detail?

twocode commented 2 years ago

I have updated the descriptions. Please CIL. @fuyufjh @soundOfDestiny

fuyufjh commented 2 years ago

  1. Physically partition base tables for security and tenancy.
  2. Compactions are scoped to a single ColumnFamily.
  3. Support cross-table transactions.
  4. Variants of ColumnFamily will be optimized for the internal layouts of structured, semi-structured, and schemaless data, respectively.

If I understand correctly, ColumnFamily means supporting multiple LSM-trees in Hummock, is that right? So my question is: can we just create multiple instances of HummockStorage to achieve that?

twocode commented 2 years ago

can we just create multiple instances of HummockStorage to achieve that purpose?

We can, but at a cost. The current HummockStorage is just a handle to a shared storage. Deploying one HummockStorage per base table would mean more mapping rules in meta/catalog, less manageability, and more tedious support for features like multi-HummockStorage transactions.
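
To illustrate the transaction point, here is a minimal Rust sketch, under the assumption that all ColumnFamilies live behind a single handle. The names `SharedStore`, `Put`, and `commit` are hypothetical and not Hummock's API. Because one handle sees every family, a cross-table batch can be validated and applied in one place; with one storage instance per table, the same batch would need coordination across instances.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical per-table data, standing in for one ColumnFamily.
type Table = BTreeMap<Vec<u8>, Vec<u8>>;

/// One write in a cross-table batch (hypothetical type).
struct Put {
    table: String,
    key: Vec<u8>,
    value: Vec<u8>,
}

/// Hypothetical single handle that owns every ColumnFamily.
#[derive(Default)]
struct SharedStore {
    families: HashMap<String, Table>,
    epoch: u64, // bumped once per committed batch
}

impl SharedStore {
    fn create_table(&mut self, name: &str) {
        self.families.entry(name.to_string()).or_default();
    }

    /// Apply the whole batch or nothing: validate every target table first,
    /// then write, then advance the epoch once for all tables.
    fn commit(&mut self, batch: Vec<Put>) -> Result<u64, String> {
        if let Some(bad) = batch.iter().find(|p| !self.families.contains_key(&p.table)) {
            return Err(format!("unknown table: {}", bad.table));
        }
        for put in batch {
            self.families
                .get_mut(&put.table)
                .expect("validated above")
                .insert(put.key, put.value);
        }
        self.epoch += 1;
        Ok(self.epoch)
    }
}
```

This only models all-or-nothing application within one process; it says nothing about durability or isolation, which a real implementation would also have to address.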

xxchan commented 2 years ago

seems stale