Open ryscheng opened 7 months ago
We can also consider Apache Iceberg instead of Delta Lake.
This would be for data upstream from the events table.
We should try to preserve the benefits we get currently from BigQuery:
Ever since we solved https://github.com/opensource-observer/oso/issues/821, it's an open question whether we should move more of our data pipeline to sqlmesh + Trino + Iceberg instead of dbt + BigQuery. This issue can track that work.
What is it?
https://delta.io/
Why? https://www.cidrdb.org/cidr2021/papers/cidr2021_paper17.pdf
Happy to stick with BigQuery public data sets for now until this becomes a stronger need.