Extract the "run sync until state ack" logic into a method on IntegrationTest.
Add an option to RecordDiffer to allow unexpected records (since we don't know how many records will actually get flushed as part of the state ack).
Switch the mid-sync checkpointing test and the truncate tests to use the new allowUnexpectedRecords option.
This PR also fixes a bug in the RecordDiffer: it wasn't sorting records correctly, so tests could fail nondeterministically.
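To illustrate why both the sort order and the new option matter, here is a minimal sketch of a sorted merge-style diff. It is not the actual RecordDiffer code: `diff_records`, `key_fields`, and `allow_unexpected_records` are hypothetical names chosen for this example, and records are assumed to be plain dicts whose key fields are mutually comparable.

```python
from typing import Any, Optional

Record = dict[str, Any]


def diff_records(
    expected: list[Record],
    actual: list[Record],
    key_fields: list[str],
    allow_unexpected_records: bool = False,
) -> Optional[str]:
    """Return a description of the differences, or None if the lists match."""

    def key(record: Record) -> tuple:
        # Sort on every key field so the ordering is total and deterministic.
        return tuple(record[f] for f in key_fields)

    expected_sorted = sorted(expected, key=key)
    actual_sorted = sorted(actual, key=key)

    problems: list[str] = []
    i = j = 0
    # Walk both sorted lists in lockstep. This only works if both sides
    # are sorted by the same key, which is why a sorting bug shows up as
    # spurious (and order-dependent) failures.
    while i < len(expected_sorted) and j < len(actual_sorted):
        e, a = expected_sorted[i], actual_sorted[j]
        if key(e) == key(a):
            if e != a:
                problems.append(f"mismatch for key {key(e)}: expected {e}, got {a}")
            i += 1
            j += 1
        elif key(e) < key(a):
            problems.append(f"missing record: {e}")
            i += 1
        else:
            # The actual output has a record we did not expect.
            if not allow_unexpected_records:
                problems.append(f"unexpected record: {a}")
            j += 1

    for e in expected_sorted[i:]:
        problems.append(f"missing record: {e}")
    if not allow_unexpected_records:
        for a in actual_sorted[j:]:
            problems.append(f"unexpected record: {a}")

    return "\n".join(problems) or None
```

With `allow_unexpected_records=True`, extra records in the destination are tolerated, but a missing or mismatched expected record still fails the diff.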
Legacy CDK connectors always flushed all pending data, even on a stream INCOMPLETE. The new CDK is stricter about protocol compliance: it makes no guarantees about pending work and only guarantees that records prior to an acked state message are persisted.
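The guarantee can be expressed as a small function over the input message stream. This is only a sketch of the rule as described here, not CDK code: the message shape (dicts with `type`, `data`, and an `id` on state messages) and the name `guaranteed_records` are assumptions made for the example.

```python
from typing import Any, Optional

Message = dict[str, Any]


def guaranteed_records(messages: list[Message], acked_state_id: Optional[int]) -> list[Any]:
    """Return the records that must be persisted given the acked state.

    Only records that appear before the acked state message are guaranteed.
    Anything after it may or may not have been flushed.
    """
    seen: list[Any] = []
    for msg in messages:
        if msg["type"] == "RECORD":
            seen.append(msg["data"])
        elif msg["type"] == "STATE" and msg["id"] == acked_state_id:
            # Everything collected so far precedes the acked state.
            return seen
    # The acked state never appeared (or nothing was acked): no guarantees.
    return []
```

A test can therefore assert that every record returned by this function is present in the destination, while treating anything else that shows up as allowed but not required.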
As a result, tests from the old CDK don't work directly on bulk CDK connectors. This PR updates these tests to force a state ack by pushing a large number of records, which forces a flush. This makes the tests take longer to run, which we should improve at some point in the future (https://github.com/airbytehq/airbyte-internal-issues/issues/10911).
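The "push records until the state is acked" approach can be sketched roughly as follows. This is a toy model, not the IntegrationTest implementation: `FakeDestination`, its size-based flush threshold, and `run_sync_until_state_ack` are all hypothetical stand-ins, and a real connector may flush on different triggers.

```python
from typing import Any


class FakeDestination:
    """Toy destination: buffers records and flushes once the buffer is full."""

    def __init__(self, flush_threshold: int) -> None:
        self.flush_threshold = flush_threshold
        self.buffer: list[Any] = []
        self.persisted: list[Any] = []
        self.pending_states: list[Any] = []
        self.acked_states: list[Any] = []

    def write_record(self, record: Any) -> None:
        self.buffer.append(record)
        if len(self.buffer) >= self.flush_threshold:
            # Flushing persists everything buffered so far, so every state
            # received before this point can now be acked.
            self.persisted.extend(self.buffer)
            self.buffer.clear()
            self.acked_states.extend(self.pending_states)
            self.pending_states.clear()

    def write_state(self, state: Any) -> None:
        self.pending_states.append(state)


def run_sync_until_state_ack(
    destination: FakeDestination,
    record: Any,
    state: Any,
    max_records: int = 100_000,
) -> int:
    """Send one record and a state, then pad with filler until the state is acked.

    Returns the total number of records sent.
    """
    destination.write_record(record)
    destination.write_state(state)
    sent = 1
    while state not in destination.acked_states:
        if sent >= max_records:
            raise RuntimeError("state was never acked; giving up")
        # Filler records exist only to trigger a flush.
        destination.write_record({"filler": sent})
        sent += 1
    return sent
```

In this model the filler records end up persisted alongside the real one, which is why the comparison step has to tolerate unexpected records.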
Closes https://github.com/airbytehq/airbyte-internal-issues/issues/10413. This PR is heavily based on https://github.com/airbytehq/airbyte/pull/48583, with the refactors listed above.