Open anhqle opened 2 weeks ago
Hmm, that would mean the column from the dim_node is required and so it's reasonable to require setting dim_node as a required dimension beforehand right? That would help limit the validation so we don't have to search the entire dimension dag to find this column. We could just look at the attributes (columns) on the dimensions found in the required dimensions for that metric.
Which brings me to another question. What if multiple dimension nodes have that particular column name? Do we just fail validation or SQL generation and raise some error indicating an ambiguous value in the metric definition?
that would mean the column from the dim_node is required and so it's reasonable to require setting dim_node as a required dimension beforehand right?
If by required dimension
, you mean colloquially that the dim node must be joined for the metric to be computable, then yes, it is required.
If by required dimension
you mean the DJ-definition i.e. "a grain at which the metric has to be consumed" (this name is frankly still unintuitive to me...), then no. In the engagement_on_android
example, that is a metric that can be aggregated at various grain, not limited to device_name_from_dim_node
.
What if multiple dimension nodes have that particular column name? Do we just fail validation or SQL generation and raise some error indicating an ambiguous value in the metric definition?
Failing seems fine to me. Requiring the user to write the full column name could also be okay, except that it'd be weird for the Transform's column to not be fully qualified, but the Dimension's column is.
Currently, metric expression can only include columns from the Transform node.
Example of why this would be useful:
MAX_BY(user_status, time_column_from_dim_node)
SUM(engagement) FILTER (WHERE device_name_from_dim_node = 'android')