Topic
No response
Relevant information
Context: https://airbytehq-team.slack.com/archives/C043JHEEYKG/p1707152897893279
We're seeing failures in source-mysql where:
- The state iterator emits the final state message (indicating that the snapshot is complete).
- Airbyte proceeds to perform incremental syncs.
This causes data loss: when a stream snapshot has failed, the state iterator should emit an intermediate state message rather than the final one. We've lost all the data since the first failure.
The fix is to change the logic so that, if a stream snapshot has failed, we either:
- Emit the intermediate state message, or
- Throw an exception to prevent the sync from progressing further.
This would be in line with how source-mongo and source-postgres deal with these failures. Furthermore, as `SourceStateIterator` will be used as the base class for emitting state counts for Postgres and Mongo, this behavior should be standardized.
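To make the proposed behavior concrete, here is a minimal, self-contained sketch of the intended control flow. It is not the actual `SourceStateIterator` code or the Airbyte CDK API: `Record`, `State`, `SnapshotFailed`, `snapshot_with_state`, and `cursor_of` are hypothetical names invented for illustration. The sketch combines both options from the proposal (checkpoint an intermediate state, then raise); either one alone would also stop the final state from being emitted after a failure.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Iterator, Optional, Union


@dataclass
class Record:
    data: Any


@dataclass
class State:
    cursor: Optional[Any]  # position of the last record read, if any
    final: bool            # True only when the snapshot ran to completion


class SnapshotFailed(Exception):
    """Hypothetical error used to stop the sync after a failed snapshot."""


def snapshot_with_state(
    records: Iterable[Any],
    cursor_of: Callable[[Any], Any],
) -> Iterator[Union[Record, State]]:
    """Yield records, then a final state only if the snapshot finished."""
    last_cursor = None
    try:
        for row in records:
            yield Record(row)
            last_cursor = cursor_of(row)
    except Exception as err:
        # The snapshot failed: checkpoint progress as an intermediate state,
        # then raise so the sync cannot move on to incremental mode.
        yield State(cursor=last_cursor, final=False)
        raise SnapshotFailed("stream snapshot did not complete") from err
    # Reached only when the underlying iterator was fully consumed.
    yield State(cursor=last_cursor, final=True)
```

With this shape, a consumer that sees `State(final=False)` followed by an exception knows the snapshot must be resumed from `cursor` on the next attempt, instead of treating the stream as fully snapshotted.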