dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0

[CT-1952] [Feature] Provide additional Jinja tests on top of the built-in ones #6794

Open b-per opened 1 year ago

b-per commented 1 year ago

Is this your first time submitting a feature request?

Describe the feature

Jinja comes with built-in tests, but it is also possible to create custom ones.

When inspecting objects with Jinja in dbt, a few extra tests would make the code shorter and more readable.

The following code inspects the graph to search for tests that depend on at least one node.

{{ graph.nodes.values() 
    | selectattr("config.materialized","==","test") 
    | selectattr("depends_on.nodes") 
    | list | tojson(indent=1)
}}

If I want to search for tests applying to a given node, I then need to loop through the results. If we had a test called contains, we could do something like:

{{ graph.nodes.values() 
    | selectattr("config.materialized","==","test") 
    | selectattr("depends_on.nodes","contains","model.project.my_model") 
    | list | tojson(indent=1)
}}

which would return the list of tests that are applied to my_model.

The list of tests I can think of for now would be:

Describe alternatives you've considered

We can already achieve the same outcome by writing longer and more complex Jinja code. This feature would make it shorter and more readable.

Who will this benefit?

People writing Jinja macros as part of packages or custom logic

Are you interested in contributing this feature?

Yes

Anything else?

No response

dbeatty10 commented 1 year ago

The tests you described sound handy @b-per.

I haven't tried the built-in test named in. Does it behave the same way as your proposed contains?

b-per commented 1 year ago

It is actually the opposite.

selectattr() requires the attribute to be the first argument of the test, so we can't actually use in for the 2nd case.

dbeatty10 commented 1 year ago

👍 Thanks for explaining @b-per.

Who would you see as the primary users of these new Jinja tests? Analytics engineers in their own dbt projects? Or would these be more for dbt package maintainers or dbt-core developers? I'm guessing this came up in the context of dbt-project-evaluator?

As an aside, I wonder how much conceptual overlap there is here with GPML? GPML is the graph pattern matching sub-language by WG3 that is the core of both SQL/PGQ and GQL, which I think are scheduled to be published this year (SQL/PGQ as Part 16 of ISO/IEC 9075, as part of SQL:2023).

b-per commented 1 year ago

My need mostly came from wanting to analyze the graph object from the IDE to get more familiar with a project I didn't know. For example, today the graph is the easiest way to find out the materialization of a model without checking whether it has been defined in the model itself, in the YML for the model, or at some level in dbt_project.yml.

That said, I believe that if it were available it would:

  1. make it easier to write packages introspecting the dbt Jinja objects
  2. make it easier for people to write on-run-start/end hooks
  3. potentially allow for more explicit code in dbt-core (I would need to look at some code to see if there is some refactoring opportunity)

I am not too sure about the conceptual overlap. GPML looks more applicable to data and graph databases, whereas this issue feels more like a pure Jinja topic.

dbeatty10 commented 1 year ago

I am not too sure about the conceptual overlap. GPML looks more applicable to data and graph databases, whereas this issue feels more like a pure Jinja topic.

In terms of conceptual overlap, I just meant to highlight that it looks like you are trying to query the graph using Jinja as the programming language.

And I wanted to call out that it sounds like GPML is a graph pattern matching sub-language that might be a published standard sometime this year. Obviously it won't be something we can use in the near-term, but might be an option for similar use cases in the long term.

dbeatty10 commented 1 year ago

@jtcohen6 could you give your thoughts about the proposal of adding the following Jinja custom tests to those that are built-in?

See below for a quick summary of pros/cons.

Pros

Cons

github-actions[bot] commented 8 months ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

b-per commented 8 months ago

Commenting because I am still keen to see this implemented!