airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

Something is wrong with how we get/store versions with metadata overrides in the platform #29553

Closed erohmensing closed 11 months ago

erohmensing commented 1 year ago

We're writing a LOT of ADVs (actor definition versions) for salesforce (screenshot omitted: private Zenhub image).

Those are supposed to be unique, and once one exists we shouldn't have to retrieve it again.

There are two issues here:

  1. Fix whatever led to us even trying to create the duplicate. This likely has to do with how overrides are handled on the registry side.
  2. Fix the unique constraint and fix up the existing data.

Re: Number 1: this is specifically happening because we attempt to fetch the registry entry for version 2.1.1. That entry contains a version override for cloud, which is 2.0.9 (so I guess at the time 2.1.1 was released, cloud was still pinned to 2.0.9?). This means that the cloud registry entry for 2.1.1 is actually the entry for 2.0.9 (I think? I'm not sure whether the metadata comes from 2.0.9, or is just the metadata for 2.1.1 with the image tag 2.0.9).

Before we fetch, we do a lookup for 2.1.1 in the DB. It doesn't exist there, which is why we keep hitting the registry to look for it. We then persist the 2.0.9 version we got back (which shouldn't be possible, see (2)) again and again. Our "check if version exists before grabbing and persisting" logic was not prepared to handle this override behavior.

While this setup is (probably) expected behavior for creating the registry itself, it's definitely not what we want when retrieving registry entries for a specific version.
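The loop described above can be sketched roughly as follows. This is an illustration only, not the platform's actual code: `RegistryEntry`, `CLOUD_REGISTRY`, `VersionStore`, and `resolve` are hypothetical names, and the in-memory list stands in for an ADV table with no working unique constraint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryEntry:
    docker_repository: str
    docker_image_tag: str


REPO = "airbyte/source-salesforce"

# Hypothetical stand-in for the cloud registry: the entry published for
# 2.1.1 carries the overridden image tag 2.0.9.
CLOUD_REGISTRY = {
    (REPO, "2.1.1"): RegistryEntry(REPO, "2.0.9"),
}


class VersionStore:
    """In-memory stand-in for the ADV table, with no unique constraint."""

    def __init__(self):
        self.rows = []
        self.registry_fetches = 0

    def exists(self, repo, tag):
        return any(
            r.docker_repository == repo and r.docker_image_tag == tag
            for r in self.rows
        )

    def resolve(self, repo, requested_tag):
        # The existence check is keyed on the *requested* version.
        if self.exists(repo, requested_tag):
            return
        self.registry_fetches += 1
        entry = CLOUD_REGISTRY[(repo, requested_tag)]
        # The row is persisted under the *overridden* tag, so the check
        # above never succeeds for 2.1.1 and we fetch again next time.
        self.rows.append(entry)


store = VersionStore()
for _ in range(3):
    store.resolve(REPO, "2.1.1")
```

In this sketch every call misses the lookup, hits the registry, and appends another 2.0.9 row, which matches the pattern of repeated writes described above.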

https://airbytehq-team.slack.com/archives/C03VDJ4FMJB/p1692296267932739

erohmensing commented 1 year ago

This is probably a case of "the registry entry should always be for the version itself, and we should take the overrides into account only when we create the actual registries", or something of the sort.

A registry entry for 2.1.1 should not have a dockerImageTag of 2.0.9
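One way to express that rule is as a check applied to a fetched entry. This is a minimal sketch, assuming the entry is available as a dict with a `dockerImageTag` key; the function names are hypothetical and not part of any Airbyte API.

```python
def check_entry_matches_version(requested_version: str, entry: dict) -> None:
    """Raise if the entry's dockerImageTag differs from the requested version."""
    tag = entry.get("dockerImageTag")
    if tag != requested_version:
        raise ValueError(
            f"registry entry for {requested_version} has dockerImageTag {tag!r}"
        )


def is_consistent(requested_version: str, entry: dict) -> bool:
    """Boolean wrapper around the check above."""
    try:
        check_entry_matches_version(requested_version, entry)
    except ValueError:
        return False
    return True
```

With a check like this, an entry fetched for 2.1.1 that carries the tag 2.0.9 would be rejected before it is persisted.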

evantahler commented 1 year ago

Grooming:

The problem:

Fix:

Sins:


https://console.cloud.google.com/storage/browser/prod-airbyte-cloud-connector-metadata-service/metadata/airbyte/source-salesforce/2.1.1;tab=objects?pageState=(%22StorageObjectListTable%22:(%22f%22:%22%255B%255D%22))&prefix=&forceOnObjectsSortingFiltering=false


erohmensing commented 1 year ago

I believe fixing this sin would also let us undo "never upsert ADVs", which we introduced to patch this but which is causing issues in the dev workflow.

erohmensing commented 1 year ago

Things we want to support:

Happy path (no platform overrides)

Considering rollbacks:

Open question:

erohmensing commented 1 year ago

Potential to include a ref underneath oss/cloud: refs do not have to be the same as overrides

erohmensing commented 11 months ago

This should be closed in https://github.com/airbytehq/airbyte/pull/30699 and https://github.com/airbytehq/airbyte-platform-internal/pull/9000

Captured feedback for metadataspec v2 in this issue.

erohmensing commented 11 months ago

Follow up: ok i guess we can upsert ADVs
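For readers unfamiliar with the pattern, here is a minimal sketch of a unique constraint combined with an upsert, using SQLite from the Python standard library. The table name `adv` and its columns are assumptions made for illustration; they are not the platform's real schema, and the platform does not necessarily use SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE adv (
        actor_definition_id TEXT NOT NULL,
        docker_image_tag    TEXT NOT NULL,
        spec                TEXT NOT NULL,
        UNIQUE (actor_definition_id, docker_image_tag)
    )
    """
)


def upsert_adv(actor_definition_id, docker_image_tag, spec):
    # Insert a new row, or update the existing one when the
    # (definition, tag) pair is already present.
    conn.execute(
        """
        INSERT INTO adv (actor_definition_id, docker_image_tag, spec)
        VALUES (?, ?, ?)
        ON CONFLICT (actor_definition_id, docker_image_tag)
        DO UPDATE SET spec = excluded.spec
        """,
        (actor_definition_id, docker_image_tag, spec),
    )


upsert_adv("source-salesforce", "2.0.9", "spec-v1")
upsert_adv("source-salesforce", "2.0.9", "spec-v2")
rows = conn.execute("SELECT docker_image_tag, spec FROM adv").fetchall()
```

Writing the same version twice leaves a single row holding the latest spec, instead of a duplicate or a constraint error.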