Current situation
The Stripe API doesn't return expandable fields through the Events API, so an incremental sync cannot build complete records without retrieving the full objects separately. As a result, existing values in the destination may be overwritten with null. One example of such an expandable field is Price.tiers.
Reproducing the issue
Create a new Price object with billing_scheme = tiered and fill in the required fields of the tiers object. Then query for events related to the newly created Price. The results don't include the tiers information.
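The reproduction steps above can be sketched as three requests against the Stripe REST API. This is a minimal sketch that only builds the request data and sends nothing. The product ID, currency, interval, and tier values are illustrative assumptions, and the helper names are hypothetical.

```python
from urllib.parse import urlencode

STRIPE_API = "https://api.stripe.com/v1"


def tiered_price_form(product_id: str) -> dict:
    """Form fields for POST /v1/prices that create a tiered Price.

    All values here are illustrative assumptions, not taken from the issue.
    """
    return {
        "product": product_id,
        "currency": "usd",
        "recurring[interval]": "month",
        "billing_scheme": "tiered",
        "tiers_mode": "graduated",
        "tiers[0][up_to]": "10",
        "tiers[0][unit_amount]": "500",
        "tiers[1][up_to]": "inf",
        "tiers[1][unit_amount]": "400",
    }


def events_url(event_type: str = "price.created") -> str:
    """GET URL listing events of one type; the embedded Price omits tiers."""
    return f"{STRIPE_API}/events?{urlencode({'type': event_type})}"


def price_url_with_tiers(price_id: str) -> str:
    """GET URL retrieving the full Price with the tiers field expanded."""
    return f"{STRIPE_API}/prices/{price_id}?{urlencode({'expand[]': 'tiers'})}"
```

Comparing the object embedded in the event with the response from the last URL should show the tiers field only in the latter.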
Desired situation
Connector logic to retrieve updated expandable fields during incremental sync is implemented. Where possible, we update multiple objects from a single API call for performance. We make sure that expandable fields that are absent from a response don't overwrite existing destination values with null.
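The "no null overwrite" requirement can be illustrated with a small merge rule. This is a sketch only: the function name is hypothetical, and where such a merge would actually run (connector or destination) is not decided here.

```python
from typing import Any, Dict, Iterable, Mapping


def merge_preserving_expandable(
    existing: Mapping[str, Any],
    incoming: Mapping[str, Any],
    expandable_fields: Iterable[str],
) -> Dict[str, Any]:
    """Merge an incoming record into an existing one.

    Expandable fields that arrive as None (i.e. not expanded) keep their
    existing value. Every other incoming field, including an explicit None
    on a non-expandable field, overwrites the existing value.
    """
    expandable = set(expandable_fields)
    merged = dict(existing)
    for key, value in incoming.items():
        if key in expandable and value is None:
            continue  # missing expansion, not a real update
        merged[key] = value
    return merged
```

For example, a stored Price with populated tiers keeps them when an event-derived record carries `tiers: None`, while a nickname that was genuinely cleared is still written as null.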
Proposed solution
We keep track of which fields are expandable throughout the API. During incremental sync, objects that have expandable data and also have update events are queried in full to populate all fields.
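A rough sketch of this approach, assuming a hand-maintained registry of expandable fields per stream and an injected `fetch` callable that performs the authenticated GET. Both are hypothetical names and not part of the existing connector.

```python
from typing import Any, Callable, Dict, List, Mapping

# Hypothetical registry: stream name -> fields that must be expanded explicitly.
# Only the tiers entries come from this issue; extend as needed.
EXPANDABLE_FIELDS: Dict[str, List[str]] = {
    "prices": ["tiers"],
    "plans": ["tiers"],
}

# fetch(path, params) is assumed to perform the GET and return parsed JSON.
Fetch = Callable[[str, Mapping[str, Any]], Dict[str, Any]]


def hydrate_from_event(stream: str, event: Mapping[str, Any], fetch: Fetch) -> Dict[str, Any]:
    """Return the record for an event, re-fetching it when the stream has expandable fields."""
    record = dict(event["data"]["object"])
    expand = EXPANDABLE_FIELDS.get(stream, [])
    if not expand:
        # Nothing to expand: the object embedded in the event is already complete.
        return record
    # One extra request per record, asking Stripe to expand the listed fields.
    return fetch(f"/v1/{stream}/{record['id']}", {"expand[]": expand})
```

Streams without registered expandable fields incur no extra request. Events for deleted objects would need separate handling, since the object can no longer be retrieved.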
Expected outcome
Data is complete and up to date even when API objects contain expandable data. We test end to end to check that no incorrect nulls are written. Because we will perform one HTTP request per record, we expect syncs to slow down, but we don't have a good measure of by how much.
Based on this information, it is assumed that the following fields are missing during incremental syncs:
Stream: charges, field: refunds (this one actually works even though Stripe lists this field as expandable. I'll contact them to get more information on this one)
Stream: checkout_sessions_line_items, fields: discounts and taxes (this one is fine because we don't retrieve line items through the events API, only the parents)
Stream: plans, field: tiers (the information is present when running full_refresh, but not when running incremental)