Open homar opened 2 years ago
It's not in the protocol definition. What should go into this field then?
cc @vkorukanti
@findepi These are the operation metrics for each operation. Let me get back to you on whether these should be part of the Protocol.
.. whether these should be part of the Protocol.
cc @claudiusli
also cc @alexjo2144 @ilfrin
@vkorukanti any new thoughts on this w.r.t. https://databricks.com/blog/2022/06/30/open-sourcing-all-of-delta-lake.html ?
Apologies for not getting back in time. The Delta-on-Spark open-source project already has metrics defined here, written as part of the commit. As for whether they should be part of the protocol: ideally they should be, but we haven't documented them yet. They evolve frequently based on need. These metrics are also currently a bag of JSON fields, so any implementation is expected to handle missing or extra fields.
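To illustrate the "bag of JSON fields" point, here is a minimal Python sketch of a reader that tolerates missing and unknown metrics. It assumes a commit log line carries a `commitInfo` object with an `operationMetrics` map whose values are strings. The helper names `read_operation_metrics` and `metric_as_int` are hypothetical, and the metric names in the usage note are only examples, not a documented list.

```python
import json
from typing import Dict, Optional


def read_operation_metrics(commit_line: str) -> Dict[str, str]:
    """Return the operationMetrics map from one JSON log line, or {} if absent.

    Assumption: metrics live under commitInfo.operationMetrics. Unknown keys
    are passed through untouched so newer writers do not break this reader.
    """
    entry = json.loads(commit_line)
    commit_info = entry.get("commitInfo") or {}
    metrics = commit_info.get("operationMetrics")
    if not isinstance(metrics, dict):
        # Field is missing or has an unexpected shape: treat as "no metrics".
        return {}
    return {str(key): str(value) for key, value in metrics.items()}


def metric_as_int(metrics: Dict[str, str], name: str) -> Optional[int]:
    """Parse one metric as an integer; None if it is missing or not numeric."""
    raw = metrics.get(name)
    if raw is None:
        return None
    try:
        return int(raw)
    except ValueError:
        return None
```

For example, given a line whose `operationMetrics` is `{"numDeletedRows": "3", "someFutureMetric": "x"}`, `metric_as_int(metrics, "numDeletedRows")` yields `3`, while a lookup for a metric the writer did not emit yields `None` instead of raising.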
The operation metrics are also listed in the $history metadata table: https://trino.io/docs/current/connector/delta-lake.html#history-table
Delta Lake has a commit field called operationMetrics that has some statistics on the rows deleted. It's not in the protocol definition, but it could be useful to include. See DeltaLakeMetadata