dbt-labs / metricflow

MetricFlow allows you to define, build, and maintain metrics in code.
https://docs.getdbt.com/docs/build/about-metricflow

[SL-2184] Conversion metrics do not apply filters to input conversion measures #1199

Open tlento opened 6 months ago

tlento commented 6 months ago

While working through enabling predicate pushdown, I discovered that conversion metrics only apply filters to the base measure, not to the conversion measure.

For example, if I have this:

metrics:
  - name: legal_click_conversions
    description: Click through rate, excluding accounts with a > 20% likelihood of being fraudulent
    type: conversion
    type_params:
      conversion_type_params:
        entity: account
        base_measure:
          name: impressions
          filter: "{{ Dimension('account__fraud_score') }} <= 0.2"
        conversion_measure:
          name: clicks
          filter: "{{ Dimension('account__fraud_score') }} <= 0.2"
        window: 3 days

then, as a user, I would reasonably expect both the base and conversion measures to throw away any input rows where account__fraud_score was > 0.2.

What we do in practice is apply the filter to the base measure, but throw it away when configuring the conversion measure, which means we will overstate the rate by some arbitrary amount. Just how much depends on how the filter skews the data.

Since filters like these are often applied specifically to deal with non-representative subsets of data, we might be grossly overstating conversions in this case.
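
To make the actual behavior concrete, the example above is evaluated roughly as if it had been written with the conversion-side filter removed (same definition as before, shown here only for illustration):

metrics:
  - name: legal_click_conversions
    type: conversion
    type_params:
      conversion_type_params:
        entity: account
        base_measure:
          name: impressions
          filter: "{{ Dimension('account__fraud_score') }} <= 0.2"
        conversion_measure:
          name: clicks   # the filter configured here is silently discarded
        window: 3 days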

This is almost certainly done to avoid applying a pre-aggregation time filter to conversions. In this example we need to include the 3 days of conversion events following each base event, so if a time filter is applied directly to the conversion input we may lose conversions that land just outside the filtered range (for example, an impression on the last day of the range whose click arrives two days later). However, what we do today is a blanket obliteration of the filter predicates from both the query input and the metric definition, and this can produce incorrect results.

Currently, users can work around this sort of filter issue by making a categorical dimension and marking it as a constant property, but this is, of course, fairly clumsy, and the hidden gotcha of throwing away conversion measure filters seems undesirable.
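
For reference, a minimal sketch of that workaround, assuming a hypothetical categorical dimension is_likely_fraud (e.g., derived from fraud_score > 0.2) that is available to both the impressions and clicks inputs; the names and the property references under constant_properties are illustrative and depend on how the dimension is modeled on each side:

metrics:
  - name: legal_click_conversions
    type: conversion
    type_params:
      conversion_type_params:
        entity: account
        base_measure:
          name: impressions
          filter: "{{ Dimension('account__is_likely_fraud') }} = false"
        conversion_measure:
          name: clicks
        window: 3 days
        constant_properties:
          # base and conversion rows only join when the flag matches, so the
          # base-side filter effectively constrains the conversion side too
          - base_property: is_likely_fraud
            conversion_property: is_likely_fraud

This keeps the conversion side consistent with the filtered base side, at the cost of maintaining an extra dimension and remembering to wire it into every conversion metric.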

Where things get tricky, however, is figuring out what to do about a categorical dimension filter passed down from outside. If a user applies the account__fraud_score filter at query time, where should it be applied? And what if they write a single filter expression like "{{ Dimension('account__fraud_score') }} <= 0.2 AND {{ Dimension('metric_time') }} BETWEEN ...."?

We'll likely need to do the full base -> conversion input join with all filter elements treated as constant properties, apply the filter against that joined output, and then do the final aggregations as appropriate.

SL-2184

tlento commented 6 months ago

Worth noting: it's entirely possible that applying filters to just the base measure is adequate, because they're applied after the conversion join and before aggregation, so this might simply be a predicate pushdown issue.

This seems unlikely, though, since the conversion metric join is applied to the filtered and aggregated output, which means the filters only get applied correctly if the filter elements (in the example above, account__fraud_score) are included in constant_properties.