apache / polaris

Apache Polaris, the interoperable, open source catalog for Apache Iceberg
https://polaris.apache.org/
Apache License 2.0

Notification API reject out of order notifications #232

Closed tzuan16 closed 2 months ago

tzuan16 commented 2 months ago

Description

UPDATE/CREATE notifications are now rejected if their timestamp is older than that of the latest notification processed for the table. DROP notifications are not affected: DROPs are idempotent, so processing them out of order is acceptable.

Updated the notification API spec to clarify that the API caller is responsible for ensuring the correct timestamp order across a sequence of notifications. Also added a description for the 409 Conflict error.
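For illustration only, the rejection rule described above could be sketched roughly as follows. This is not the Polaris implementation. `NotificationTracker`, `OutOfOrderNotification`, and the epoch-millisecond timestamps are hypothetical names and assumptions. Mapping the error to HTTP 409 and leaving the stored timestamp untouched on DROP are also assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from enum import Enum


class NotificationType(Enum):
    CREATE = "CREATE"
    UPDATE = "UPDATE"
    DROP = "DROP"


class OutOfOrderNotification(Exception):
    """Hypothetical error; an HTTP layer could map it to 409 Conflict."""


@dataclass
class NotificationTracker:
    # Latest accepted CREATE/UPDATE timestamp (epoch millis, assumed) per table.
    latest: dict = field(default_factory=dict)

    def process(self, table: str, kind: NotificationType, timestamp_ms: int) -> None:
        if kind is NotificationType.DROP:
            # DROP is idempotent, so it bypasses the ordering check.
            # Assumption: it also leaves the stored timestamp unchanged.
            return
        last = self.latest.get(table)
        # Reject only strictly older timestamps; equal timestamps are accepted.
        if last is not None and timestamp_ms < last:
            raise OutOfOrderNotification(
                f"{kind.value} for {table} has timestamp {timestamp_ms}, "
                f"older than latest processed {last}"
            )
        self.latest[table] = timestamp_ms
```

With this sketch, a CREATE at 100 followed by an UPDATE at 200 is accepted, a later UPDATE at 150 raises `OutOfOrderNotification`, and a DROP at 50 still goes through.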

tzuan16 commented 2 months ago

> I am a little concerned about clock skew across multiple senders with this approach though. Rather than timestamp, would it be possible for notification senders to use something like an ordering key, and potentially for them to obtain it from Polaris?

After discussing this with Eric offline, we decided to leave it to the notification sender to ensure the correct order of timestamps for a sequence of notifications. Polaris serves as the "external" catalog here, so the single source of truth should be the actual catalog that "owns" the table. We don't want the proprietary catalog to have to talk to Polaris before sending a notification.
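As a rough sketch of what "ensuring the correct order" might look like on the sender side, a single sender process could hand out per-table timestamps that never go backwards, even if its local clock does. `MonotonicTimestamps` and its injectable `clock` are hypothetical and not part of any Polaris client. This does not address skew between multiple independent senders, which would still need to be coordinated by the owning catalog.

```python
import time


class MonotonicTimestamps:
    """Hands out non-decreasing epoch-millisecond timestamps per table (hypothetical helper)."""

    def __init__(self, clock=lambda: int(time.time() * 1000)):
        self._clock = clock  # injectable so the behaviour can be tested
        self._last = {}      # table identifier -> last timestamp issued

    def next(self, table: str) -> int:
        now = self._clock()
        last = self._last.get(table)
        # If the clock stalled or stepped backwards, bump past the last value
        # so the receiver never sees an older timestamp for this table.
        ts = now if last is None or now > last else last + 1
        self._last[table] = ts
        return ts
```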

RussellSpitzer commented 2 months ago

Thanks @eric-maynard and @tzuan16, set to auto-merge on tests passing.