airbytehq / airbyte

[source-amazon-seller-partner] GET_FLAT_FILE_RETURNS_DATA_BY_RETURN_DATE schema validation errors #39502

Open zerobearing2 opened 3 weeks ago

zerobearing2 commented 3 weeks ago

Connector Name

source-amazon-seller-partner

Connector Version

4.3.1

What step the error happened?

During the sync

Relevant information

When extracting the returns report into BigQuery, date fields are nulled because of schema validation errors during the sync. Here is an example of the destination _airbyte_meta:

{
  "changes": [
    {
      "change": "NULLED",
      "field": "Order date",
      "reason": "DESTINATION_TYPECAST_ERROR"
    },
    {
      "change": "NULLED",
      "field": "Return request date",
      "reason": "DESTINATION_TYPECAST_ERROR"
    },
    {
      "change": "NULLED",
      "field": "Return delivery date",
      "reason": "DESTINATION_TYPECAST_ERROR"
    },
    {
      "change": "NULLED",
      "field": "SafeT claim creation time",
      "reason": "DESTINATION_TYPECAST_ERROR"
    }
  ],
  "sync_id": 0
}

Relevant log output

2024-06-14 15:51:37 platform > Schema validation errors found for stream _GET_FLAT_FILE_RETURNS_DATA_BY_RETURN_DATE. Error messages: [$.Order date: 16-Mar-2024 is an invalid date-time, $.Order date: 17-Mar-2024 is an invalid date-time, $.Order date: 01-May-2024 is an invalid date-time, $.Return request date: 11-Jun-2024 is an invalid date-time, $.Return request date: 13-Jun-2024 is an invalid date-time, $.Order date: 16-Apr-2024 is an invalid date-time, $.Return request date: 12-Jun-2024 is an invalid date-time, $.Order date: 02-May-2024 is an invalid date-time, $.Order date: 06-May-2024 is an invalid date-time, $.Order date: 18-Apr-2024 is an invalid date-time, $.Order date: 21-May-2024 is an invalid date-time, $.Order date: 30-Apr-2024 is an invalid date-time, $.Order date: 31-May-2024 is an invalid date-time, $.Return request date: 14-Jun-2024 is an invalid date-time, $.Order date: 11-Mar-2024 is an invalid date-time, $.Order date: 12-Mar-2024 is an invalid date-time, $.Order date: 18-May-2024 is an invalid date-time, $.Order date: 15-May-2024 is an invalid date-time, $.Order date: 09-Jun-2024 is an invalid date-time, $.Order date: 08-Jun-2024 is an invalid date-time, $.Order date: 07-Jun-2024 is an invalid date-time, $.Order date: 19-May-2024 is an invalid date-time, $.Order date: 21-Apr-2024 is an invalid date-time, $.Order date: 06-Jun-2024 is an invalid date-time, $.Order date: 04-Jun-2024 is an invalid date-time, $.Order date: 05-Jun-2024 is an invalid date-time, $.Order date: 02-Jun-2024 is an invalid date-time, $.Order date: 03-Jun-2024 is an invalid date-time, $.Order date: 24-Apr-2024 is an invalid date-time, $.Order date: 01-Jun-2024 is an invalid date-time, $.Order date: 27-Apr-2024 is an invalid date-time, $.Order date: 13-Jun-2024 is an invalid date-time, $.Order date: 10-Jun-2024 is an invalid date-time, $.Order date: 22-Mar-2024 is an invalid date-time, $.Order date: 24-May-2024 is an invalid date-time, $.Order date: 12-Jun-2024 is an invalid date-time, $.Order date: 11-Jun-2024 is an invalid date-time, $.Order date: 25-May-2024 is an invalid date-time, $.Order date: 26-May-2024 is an invalid date-time, $.SafeT claim creation time:   is an invalid date-time, $.Order date: 29-May-2024 is an invalid date-time, $.Return delivery date:   is an invalid date-time, $.Order date: 28-May-2024 is an invalid date-time]
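
To see which fields are affected at a glance, the error list in this log line can be tallied per field. The following is only a rough sketch: the function name count_invalid_by_field and the regular expression are illustrative assumptions based on the message shape shown above, not part of Airbyte.

```python
import re
from collections import Counter

# Matches entries such as "$.Order date: 16-Mar-2024 is an invalid date-time".
# The pattern is an assumption derived from the log line above. \s+ tolerates
# a line wrap between "invalid" and "date-time".
ERROR_PATTERN = re.compile(
    r"\$\.(?P<field>[^:]+): (?P<value>.*?) is an invalid\s+date-time"
)


def count_invalid_by_field(log_text: str) -> Counter:
    """Count schema validation errors per field name (hypothetical helper)."""
    return Counter(m.group("field") for m in ERROR_PATTERN.finditer(log_text))


sample = (
    "Error messages: [$.Order date: 16-Mar-2024 is an invalid date-time, "
    "$.Return request date: 11-Jun-2024 is an invalid date-time, "
    "$.SafeT claim creation time:   is an invalid date-time, "
    "$.Order date: 17-Mar-2024 is an invalid date-time]"
)
counts = count_invalid_by_field(sample)
for field, n in counts.most_common():
    print(f"{field}: {n}")
```

In the log above, the dated values all use the DD-Mon-YYYY form, while "SafeT claim creation time" and "Return delivery date" appear only with blank values.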

marcosmarxm commented 2 weeks ago

Thanks for reporting the issue, @zerobearing2. This report will probably need a transformer operation that updates the datetime format of those fields. Something similar to: https://github.com/airbytehq/airbyte/blob/c7ecc41317cc93dc697821a54154ffad0676e2c1/airbyte-integrations/connectors/source-amazon-seller-partner/source_amazon_seller_partner/streams.py#L717-L727
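
As a rough illustration of what such a transformation might do (not the connector's actual code), the sketch below converts DD-Mon-YYYY values like "16-Mar-2024" into ISO 8601 date-time strings and maps blank values to None. The helper names, the choice of midnight UTC, and the field list are assumptions taken from the _airbyte_meta example in this issue. Wiring it into the connector's transformer is left out.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional

# Fixed month table so parsing does not depend on the process locale
# (strptime's %b would).
_MONTHS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}

# Fields reported as NULLED in the issue; assumed to be the ones needing a fix.
DATE_FIELDS = (
    "Order date",
    "Return request date",
    "Return delivery date",
    "SafeT claim creation time",
)


def normalize_report_date(value: Optional[str]) -> Optional[str]:
    """Turn 'DD-Mon-YYYY' into an ISO 8601 date-time (hypothetical helper).

    Blank values become None. Values in any other shape are returned
    unchanged, since the real format of every field is not known here.
    """
    if value is None or not value.strip():
        return None
    try:
        day, month, year = value.strip().split("-")
        # Assumption: the report carries no time of day, so use midnight UTC.
        parsed = datetime(int(year), _MONTHS[month.lower()], int(day),
                          tzinfo=timezone.utc)
    except (ValueError, KeyError):
        return value
    return parsed.isoformat()


def normalize_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the date fix to the known date fields of one record."""
    return {
        key: normalize_report_date(val) if key in DATE_FIELDS else val
        for key, val in record.items()
    }


example = normalize_record(
    {"Order date": "16-Mar-2024", "Return delivery date": " ", "Order ID": "x-1"}
)
print(example)
```

Returning unrecognized values unchanged keeps the sketch from silently dropping data. A real fix would need to confirm the actual format of each field in the report.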

zerobearing2 commented 2 weeks ago

"it will need to implement a transformer operation to this report update the datetime format for that field"

@marcosmarxm, any chance you have an ETA for a fix? This is blocking us at the moment. As a workaround, we had to extract the report to S3 and then create a separate source/destination to load the files into BigQuery, which avoids the schema issues during loads.