AbsaOSS / enceladus

Dynamic Conformance Engine
Apache License 2.0

Conformance Rules - Tooltips #308

Open asheem opened 5 years ago

asheem commented 5 years ago

Background

Conformance Rules can be difficult to understand.

Feature

We can add tooltips to the rules which provide a concise explanation.

Proposed Solution [Optional]

| Conformance Rule | Tooltip |
| --- | --- |
| CastingConformanceRule | Conversion of one input data type to another output data type. |
| ConcatenationConformanceRule | Combine data contents from two or more input columns together in one output column. |
| DropConformanceRule | Drop (or remove) a column from the input dataset so that it doesn't appear in the output. |
| LiteralConformanceRule | Place a literal value in a specified output column. |
| MappingConformanceRule | Lookup and join values from input and output columns to target columns. |
| NegationConformanceRule | Convert numeric values from positive to negative, and vice versa. |
| SingleColumnConformanceRule | Place a value from an input column in a struct, where the output column contains the name of the input field as an alias. |
| SparkSessionConformanceRule | Extract values from the spark session conf_info file for use in the dataset (e.g. version). |
| UppercaseConformanceRule | Transform input text values to uppercase. |

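To make a few of these tooltips concrete, here is a rough sketch in plain Python that treats a row as a dict. The function names and the sample row are made up for illustration; this is not how Enceladus implements the rules, only a way to picture what each tooltip describes.

```python
# Hypothetical stand-ins for some conformance rules; none of these names
# come from the Enceladus code base. Each function returns a new row dict.

def concatenate(row, input_cols, output_col):
    # Combine the contents of two or more input columns into one output column.
    return {**row, output_col: "".join(str(row[c]) for c in input_cols)}

def uppercase(row, input_col, output_col):
    # Transform an input text value to uppercase.
    return {**row, output_col: row[input_col].upper()}

def cast(row, input_col, output_col, to_type):
    # Convert the input value to another data type.
    return {**row, output_col: to_type(row[input_col])}

def negate(row, input_col, output_col):
    # Flip the sign of a numeric value.
    return {**row, output_col: -row[input_col]}

def literal(row, output_col, value):
    # Place a fixed literal value in the output column.
    return {**row, output_col: value}

def drop(row, col):
    # Remove a column so that it does not appear in the output.
    return {k: v for k, v in row.items() if k != col}

conformed = {"first": "ab", "second": "cd", "amount": "42"}
conformed = concatenate(conformed, ["first", "second"], "joined")
conformed = uppercase(conformed, "joined", "joined_upper")
conformed = cast(conformed, "amount", "amount_int", int)
conformed = negate(conformed, "amount_int", "amount_neg")
conformed = literal(conformed, "source", "demo")
conformed = drop(conformed, "second")
print(conformed)
```
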
Zejnilovic commented 5 years ago

I like it, but I am still unsure about LiteralConformanceRule and SingleColumnConformanceRule

GeorgiChochov commented 5 years ago

Here's an explanation of what these two rules actually do. @asheem I trust you can improve on my hasty wording:

For example, the following configuration:

Input Column: single_value
Output Column: conformed_struct
Input Column Alias: value

would result in the following output:

| single_value | conformed_struct |
| --- | --- |
| "asd" | { value : "asd" } |

asheem commented 5 years ago

Thanks guys, now updated.

GeorgiChochov commented 5 years ago

> Place a value from an input column or alias in a struct to an output column.

It's actually not "input column or alias". It is the value of the input column placed as a field in a struct, where the name of the field is the alias provided:

| input_column | output_column |
| --- | --- |
| "asd" | { input_column_alias : "asd" } |

asheem commented 5 years ago

Thanks Georgi, now updated.
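
For anyone reading the thread later, the SingleColumnConformanceRule behaviour Georgi describes can be pictured with a small plain-Python sketch. The function name `single_column` and the dict-based row are assumptions made for illustration, not the Enceladus implementation; the column names are the ones from the example configuration earlier in the thread.

```python
# Hypothetical sketch: wrap the input column's value in a struct (here a dict)
# whose single field is named after the configured alias.

def single_column(row, input_col, output_col, alias):
    return {**row, output_col: {alias: row[input_col]}}

result = single_column(
    {"single_value": "asd"},
    input_col="single_value",
    output_col="conformed_struct",
    alias="value",
)
print(result)
```

The input column is left in place and the struct lands in the output column, which matches the example tables in the comments above.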