datafold / data-diff

Compare tables within or across databases
https://docs.datafold.com
MIT License
2.93k stars, 262 forks

Support automatic sampling in --dbt (non --cloud) #520

Status: Closed (dlawin closed 1 year ago)

dlawin commented 1 year ago

When diffing locally, I would like to be able to diff against sampled data. For large tables, this would make the diff considerably faster while still giving me an understanding of the changes.

This will be a larger lift, but should work interchangeably with the --cloud feature described here: https://github.com/datafold/data-diff/issues/519
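
To make the request concrete, here is a rough sketch of one way sampling could work for a local diff. It is only an illustration under assumptions: the tables are plain Python dicts keyed by primary key, and the names `in_sample` and `sampled_diff` are hypothetical, not part of data-diff. The idea is to pick the sample from a hash of the primary key, so both tables sample the same rows and a row is never reported as missing just because it was sampled on one side only.

```python
import hashlib


def in_sample(key, rate_percent):
    """Return True if `key` is in the sample. Depends only on the key itself."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket in 0..99
    return bucket < rate_percent


def sampled_diff(table_a, table_b, rate_percent=10):
    """Diff two {primary_key: row} mappings, looking only at sampled keys.

    Hypothetical helper for illustration; not the data-diff implementation.
    """
    all_keys = table_a.keys() | table_b.keys()
    keys = {k for k in all_keys if in_sample(k, rate_percent)}
    diff = []
    for k in sorted(keys):
        row_a, row_b = table_a.get(k), table_b.get(k)
        if row_a != row_b:  # changed, added, or removed row
            diff.append((k, row_a, row_b))
    return diff


if __name__ == "__main__":
    prod = {i: ("row", i) for i in range(1000)}
    dev = dict(prod)
    dev[3] = ("row", -3)  # one modified row in the dev copy
    for key, before, after in sampled_diff(prod, dev, rate_percent=100):
        print(key, before, after)
```

With a lower `rate_percent`, only changes whose keys land in the sample are reported, so the result is an estimate of the full diff rather than a complete list.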

github-actions[bot] commented 1 year ago

This issue has been marked as stale because it has been open for 60 days with no activity. If you would like the issue to remain open, please comment on the issue and it will be added to the triage queue. Otherwise, it will be closed in 7 days.

mimoyer21 commented 1 year ago

Would still love this feature, so commenting to keep this issue open.

dlawin commented 1 year ago

> Would still love this feature, so commenting to keep this issue open.

Agreed. I think I need to add a tag to make some issues immune to the stale bot.

github-actions[bot] commented 1 year ago

This issue has been marked as stale because it has been open for 60 days with no activity. If you would like the issue to remain open, please comment on the issue and it will be added to the triage queue. Otherwise, it will be closed in 7 days.

github-actions[bot] commented 1 year ago

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment and it will be reopened for triage.