dlt-hub / dlt

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
https://dlthub.com/docs
Apache License 2.0

Show offending item in logs when schema_contract is set to "freeze" #1772

Open akelad opened 2 months ago

akelad commented 2 months ago

Feature description

When using schema_contract="freeze", if a pipeline load fails because a new record doesn't match the existing data type of a column, the offending record/item should be printed in the logs so that the user can figure out what to do about it.

Are you a dlt user?

Yes, I'm already a dlt user.

Use case

Right now, if you set schema_contract to "freeze" and the pipeline tries to load a new record whose value has a different type than the existing column, you get a log message along the lines of:

Trying to create new variant column something__v_text to table my_table but data_types are frozen..

Even in debug mode, dlt doesn't show you which item caused this, which makes it very hard to debug.

Slack convo: https://dlthub-community.slack.com/archives/C04DQA7JJN6/p1724789620799849
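
To illustrate the kind of information being asked for, here is a minimal sketch of a stand-alone pre-check that reports which items would change a column's type. It is not part of dlt and does not use dlt's type inference: the helper name find_type_mismatches is hypothetical, it compares plain Python types only, and it assumes flat (non-nested) dict items.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Hypothetical helper, not a dlt API: it only compares Python types and
# assumes each item is a flat dict (no nested tables).
Mismatch = Tuple[int, str, str, str, Dict[str, Any]]


def find_type_mismatches(items: Iterable[Dict[str, Any]]) -> List[Mismatch]:
    """Return (index, column, expected_type, actual_type, item) for every
    value whose Python type differs from the first type seen in that column."""
    first_seen: Dict[str, type] = {}
    mismatches: List[Mismatch] = []
    for index, item in enumerate(items):
        for column, value in item.items():
            if value is None:
                # None carries no type information, so it is skipped.
                continue
            # Remember the first non-None type seen for this column.
            expected = first_seen.setdefault(column, type(value))
            if type(value) is not expected:
                mismatches.append(
                    (index, column, expected.__name__, type(value).__name__, item)
                )
    return mismatches


if __name__ == "__main__":
    rows = [
        {"id": 1, "something": 10},
        {"id": 2, "something": "ten"},  # would trigger a variant column
    ]
    for index, column, expected, actual, item in find_type_mismatches(rows):
        print(f"item #{index}: column {column!r} expected {expected}, got {actual}: {item}")
```

Running something like this over the extracted items before the load narrows down the record that triggers the "data_types are frozen" message, although dlt's own coercion rules may differ from a plain Python type comparison.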

Proposed solution

No response

Related issues

No response