dbt-labs / dbt-jsonschema

Apache License 2.0

[unit testing] unit testing versioned models #121

Open graciegoheen opened 7 months ago

graciegoheen commented 7 months ago

Add support for YAML that attaches a unit test to a specific version (or versions) of a model - spec outlined here.

joellabes commented 6 months ago

@graciegoheen I have been trying to make this work, but with the current spec it's not possible to represent in JSON Schema 😔

Consider these examples:

unit_tests:
  - name: test_normal_model
    model: unversioned_model
  - name: test_is_valid_email_address
    model: dim_customers
      versions:
        include: 
          - 2

The former is represented simply:

/*skipping the truly wild amount of boilerplate it takes to set up a JSON Schema */
...
"properties": {
  "model": {"type": "string"}
}
...

model is a string and its type is easy to set.

For the latter, there needs to be a versions object, which is easy enough:

...
"properties": { 
  "model": {
    "type": "object",
    "properties": {
      "versions": {
        "type": "object",
        /*skipping the truly wild amount of brackets it would take to 
        spell out the include/exclude stuff inside this object, but it's doable*/
      }
    }
  }
}

but there also needs to be a place to put the name of the model - now that its type is set as an object, it can't also be a string.

So to be able to validate this, we'd need to have it look more like this:

unit_tests:
  - name: test_normal_model
    model: unversioned_model #note that this can stay as-is, we can say "either a string or an object"
  - name: test_is_valid_email_address
    model: 
      model_name: dim_customers #break this out to its own key
      versions:
        include: 
          - 2
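
For what it's worth, here is a rough sketch of how the "either a string or an object" shape could look in the schema. It assumes `oneOf` for the union and treats `include`/`exclude` as plain arrays with no item type; both are my assumptions for illustration, not something the spec settles, and the surrounding boilerplate is left out.

```json
{
  "properties": {
    "model": {
      "$comment": "Sketch only: required keys and include/exclude item types are assumptions",
      "oneOf": [
        { "type": "string" },
        {
          "type": "object",
          "required": ["model_name"],
          "properties": {
            "model_name": { "type": "string" },
            "versions": {
              "type": "object",
              "properties": {
                "include": { "type": "array" },
                "exclude": { "type": "array" }
              }
            }
          }
        }
      ]
    }
  }
}
```

Because a value can't be both a string and an object, exactly one branch ever matches, so `oneOf` and `anyOf` would behave the same here.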

If we don't want to change the spec, then the only option is to loosen the model block in the JSON Schema so that it accepts any value. In that case we would lose autocomplete and YAML validation for that block, but at least it wouldn't incorrectly flag a versioned model as invalid for not being a string.
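
As a sketch of that fallback (again with the boilerplate left out), the `model` entry could carry no validation keywords at all. A schema with no constraints accepts any value, and `$comment` is only an annotation:

```json
{
  "properties": {
    "model": {
      "$comment": "Sketch of the fallback: no constraints, so strings and objects both pass"
    }
  }
}
```

A slightly tighter variant would be `"type": ["string", "object"]`, which still can't say anything about what goes inside the object.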

JSON Schema is a very hard language to explain, so I'm happy to sync with you live on this, or here's my buddy ChatGPT working through the options with me too: https://chat.openai.com/share/e/04bba711-8488-443b-851c-68249285ba47