sdf-labs / tests

Standard testing library for SDF data tests
Apache License 2.0

Add foreign_key test #6

Open dbrtly opened 1 month ago

dbrtly commented 1 month ago

Summarize the feature in question. Every distinct value in a given column of this table should also exist among the distinct values of my_table.fk_column.

columns:
  - name: a
    tests:
      - expect: foreign_key("db_a.schema_a.model_a", "column_a")
  - name: b
    tests:
      - expect: foreign_key("schema_b.model_b", "column_b")
  - name: c
    tests:
      - expect: foreign_key("model_c", "column_c")

Validates:
  - The referenced table.column has a primary_key() sdf test.
  - No values exist in model.column that do not also exist in the referenced model.column.
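The data-level part of these checks can be sketched as an anti-join. This is a minimal illustration using Python's built-in sqlite3 module, not the sdf implementation: the helper names `orphan_values` and `is_unique_and_not_null` and the table `this_table` are hypothetical, NULL values in the child column are assumed to be ignored (as with SQL foreign keys), and the uniqueness query is only a data-level stand-in for "has a primary_key() test".

```python
import sqlite3


def orphan_values(conn, child_table, child_column, parent_table, parent_column):
    """Return distinct child values that have no match in the parent column.

    Assumption: NULLs in the child column are skipped rather than reported.
    Identifiers are interpolated directly, which is fine for a sketch but
    not for untrusted input.
    """
    query = f"""
        SELECT DISTINCT c.{child_column}
        FROM {child_table} AS c
        LEFT JOIN {parent_table} AS p
          ON c.{child_column} = p.{parent_column}
        WHERE c.{child_column} IS NOT NULL
          AND p.{parent_column} IS NULL
        ORDER BY c.{child_column}
    """
    return [row[0] for row in conn.execute(query)]


def is_unique_and_not_null(conn, table, column):
    """Data-level stand-in for a primary key check on the referenced column."""
    # COUNT(DISTINCT col) ignores NULLs, so the two counts match only when
    # the column has no NULLs and no duplicates.
    total, distinct_non_null = conn.execute(
        f"SELECT COUNT(*), COUNT(DISTINCT {column}) FROM {table}"
    ).fetchone()
    return total == distinct_non_null


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE model_a (column_a INTEGER);
    CREATE TABLE this_table (a INTEGER);
    INSERT INTO model_a VALUES (1), (2), (3);
    INSERT INTO this_table VALUES (1), (2), (2), (4), (NULL);
""")

print(is_unique_and_not_null(conn, "model_a", "column_a"))  # True
print(orphan_values(conn, "this_table", "a", "model_a", "column_a"))  # [4]
```

An empty result from `orphan_values` would correspond to a passing test; any returned values are the ones that break the relationship.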

What team(s) will benefit from this feature request? Although the OneBigTable approach sometimes means fewer foreign key relationships, foreign keys remain a common design pattern in the analytics wild. Many databases cannot enforce foreign key constraints natively.

Is this an enhancement to an existing feature? If so, briefly describe the update. A data-driven check that identifies rows in a staging table where the foreign key relationship is invalid, before data is loaded from the staging table into the target table.
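One way to picture the staging step is a row-level split before the load. The sketch below is plain Python with hypothetical names (`split_staging_rows`, `customer_id`, and the sample rows are invented for illustration) and assumes that rows with a missing foreign key value are allowed through; it does not reflect how sdf would run the test.

```python
def split_staging_rows(staging_rows, fk_column, valid_keys):
    """Partition staging rows into (loadable, rejected) by foreign key validity.

    Assumption: a row whose foreign key value is None is treated as loadable,
    mirroring how SQL foreign keys ignore NULLs.
    """
    loadable, rejected = [], []
    for row in staging_rows:
        value = row[fk_column]
        if value is None or value in valid_keys:
            loadable.append(row)
        else:
            rejected.append(row)
    return loadable, rejected


# Keys present in the referenced (target-side) column.
valid_keys = {10, 20, 30}

staging_rows = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": 99},    # no matching key
    {"order_id": 3, "customer_id": None},  # missing key, allowed through
]

loadable, rejected = split_staging_rows(staging_rows, "customer_id", valid_keys)
print([row["order_id"] for row in loadable])  # [1, 3]
print([row["order_id"] for row in rejected])  # [2]
```

Only the loadable rows would move on to the target table; the rejected rows are the ones a failing foreign_key test would surface.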

Describe alternatives you've considered if applicable. Some storage engines can enforce constraints similarly to traditional relational databases. Exceptions exist (notably BigQuery).

Additional context Will be happy to contribute code.