Support "materialize all" for multi-assets in Dagit.

What's the use case?

We have a @multi_asset (def featurize) that produces multiple outputs (imputed_features and features). Both of these outputs are always (unconditionally) materialized from the featurize multi-asset.

We have modeled the asset something like this:

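from dagster import AssetOut, Output, multi_asset

@multi_asset(outs={"imputed_features": AssetOut(), "features": AssetOut()})
def featurize(context):
    # Both outputs are always yielded; neither is optional.
    yield Output(value=123, output_name="imputed_features")
    yield Output(value=456, output_name="features")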

When you load Dagit and try to materialize featurize together with both of its outputs, you can't. The asset catalog shows only imputed_features and features, not featurize itself, so to materialize featurize you have to select both imputed_features and features by hand. Otherwise, Dagit presents an error, since both outputs are required.

This is pretty inconvenient for large asset graphs -- you have to know all of the outputs for a multi-asset, and how many may/may not be required, before you can even make a selection and hit materialize.
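To make the problem concrete, here is a minimal pure-Python sketch of the check a user currently has to do in their head. It is not Dagster code: the `MULTI_ASSETS` mapping and the `missing_required_outputs` helper are hypothetical names invented for illustration, and the sketch assumes every output of a multi-asset is required.

```python
# Hypothetical model, not a Dagster API: each multi-asset name maps to the
# set of outputs it always produces (all assumed to be required).
MULTI_ASSETS = {"featurize": {"imputed_features", "features"}}


def missing_required_outputs(selection, multi_assets=MULTI_ASSETS):
    """Return, per multi-asset, the required outputs absent from a partial selection."""
    selected = set(selection)
    missing = {}
    for name, outputs in multi_assets.items():
        chosen = outputs & selected
        # A problem only arises when some, but not all, outputs are selected.
        if chosen and chosen != outputs:
            missing[name] = sorted(outputs - chosen)
    return missing
```

Selecting only features would report imputed_features as missing for featurize, which corresponds to the situation where Dagit shows an error today.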

What we would very much like to be able to do is simply "materialize featurize". In other words, see the multi-asset itself in Dagit in the asset graph, with the outputs linked to it. It would probably look very similar to the way asset groups look. And then be able to materialize all of the outputs together.

Of note, we can achieve the desired behavior by (not super intuitively) making the asset outputs optional.

@multi_asset(
    outs={
        "imputed_features": AssetOut(is_required=False),
        "features": AssetOut(is_required=False),
    },
    can_subset=True,
)
def featurize(context):
    yield Output(value=123, output_name="imputed_features")
    yield Output(value=456, output_name="features")

In this case, you can select either output for materialization, and both outputs end up materialized. However, there are some drawbacks to this approach.

The code no longer accurately models the behavior of the assets. In truth, all of the asset outputs really are required. There will never be a subset of outputs.
All Dagit screens visually represent only the selected asset output as being materialized. For example, if you select imputed_features for materialization, the run display and run history both only include imputed_features, even though features also gets materialized.
The fact that features also gets materialized surfaces only as a warning in the logs, stating that an "unexpected asset" is being materialized.

Ideas of implementation

Treat multi-assets more like asset groups in Dagit. Allow the @multi_asset itself to be materialized as a whole.

It's important to have @multi_asset distinct from an asset group, because the code that is generating these outputs cannot be teased out into separate assets. Otherwise we'd use an asset group. But being able to "materialize all" for a @multi_asset, like we can for asset groups, would be ideal.
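One way to picture the requested "materialize all" behavior is as a selection-expansion step. The sketch below is plain Python under stated assumptions, not how Dagster or Dagit implements selection: `expand_selection` and the `multi_assets` mapping are hypothetical, and the sketch assumes that picking either the multi-asset's name or any one of its outputs should pull in every sibling output.

```python
def expand_selection(selection, multi_assets):
    """Expand a selection so each touched multi-asset contributes all its outputs.

    `multi_assets` is a hypothetical mapping from a multi-asset name to the
    set of outputs it always produces together.
    """
    expanded = set()
    for item in selection:
        if item in multi_assets:
            # Selecting the multi-asset itself means "materialize all".
            expanded |= multi_assets[item]
        else:
            expanded.add(item)
    # Selecting any single output also pulls in its sibling outputs.
    for outputs in multi_assets.values():
        if outputs & expanded:
            expanded |= outputs
    return sorted(expanded)


multi_assets = {"featurize": {"imputed_features", "features"}}
full_selection = expand_selection(["featurize"], multi_assets)
```

With this model, selecting featurize, or only features, resolves to both imputed_features and features, while assets outside any multi-asset pass through unchanged.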

Additional information

No response

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization.
