open-metadata / OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
https://open-metadata.org
Apache License 2.0
5.57k stars 1.05k forks source link

Create reusable Test Cases / Test Case Template. #10530

Open DovileKr opened 1 year ago

DovileKr commented 1 year ago

Is your feature request related to a problem? Please describe. Data Quality. Currently I have to create a test case for each table or column even if I am testing for the same things. For example:

  1. Email or Date format with Regex
  2. Columns across different systems and tables only containing values of valid customer types. To have to recreate the same test logic for the identical test case is time consuming.

Describe the solution you'd like It would be very convenient to be able to set Template Test Cases and then simply chose like

  1. must - instead of selecting a test definition one can select a predefined test case template "email format" or "customer group".
  2. should - from a test case template - choose columns/tables.
  3. nice - select tables and columns based on their tags. For example tag columns with "email" and then one is able to run test case template across all columns tagged - "email" or columns tagged with a glossary term "customer group".

One added benefit of this would be being able to track the DQ of the same logical attribute across systems. For example: to see the quality of "customer group" attribute, I could look into the quality results for the template test cases, rather than individual test cases. Even if they are done as part of different Test Suites. Really good view for Data Stewards.

Describe alternatives you've considered Just an idea

Additional context Add any other context or screenshots about the feature request here.

DovileKr commented 9 months ago

hello, do we have any plans for this enhancement?

bgalvao commented 8 months ago

+1

In my case we are reconciliating 2 different sources for a machine learning model training and inference respectively. The two tables will have different lineages, but we want them to conform to the same suite of checks / tests / constraints.

Having reusable test cases - or being able to reflect the tests of one table onto another - would be a big helper for us.