Open sryza opened 2 years ago
This would be really useful. For what it's worth, the "natural" place for me to have this config would be in the source asset, i.e. if I could do the following:
entities = SourceAsset("entities", config_schema={"url": str})
If this config was then available to the IOManager, I could have e.g. a production run read from S3:
assets:  # Or perhaps separated as source_assets
  entities:
    config:
      url: s3://mybucket/path/to/entities.csv
or a "test" run using a file on the local filesystem:
assets:
  entities:
    config:
      url: path/to/entities.csv
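To make the idea concrete, here is a minimal, library-agnostic sketch of how loading code could branch on a configured url like the two above. It uses only the Python standard library and is not Dagster API; the function name `describe_source` and the returned dict shape are made up for illustration.

```python
from urllib.parse import urlparse


def describe_source(url: str) -> dict:
    """Classify a configured url as an S3 object or a local file (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme == "s3":
        # For s3://bucket/key, the bucket is the netloc and the key is the path
        # without its leading slash.
        return {"kind": "s3", "bucket": parsed.netloc, "key": parsed.path.lstrip("/")}
    if parsed.scheme in ("", "file"):
        # No scheme (or file://) is treated as a path on the local filesystem.
        return {"kind": "local", "path": parsed.path}
    raise ValueError(f"Unsupported URL scheme: {parsed.scheme!r}")


prod = describe_source("s3://mybucket/path/to/entities.csv")
test = describe_source("path/to/entities.csv")
print(prod)
print(test)
```

An IO manager that received this config could dispatch on `kind` to pick an S3 client or a plain file read, so the same asset definition would serve both the production and the test run.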
+1
@sryza I see this is stale, but I would love to have this available.
I can do this with ops via the type system (with a custom type and loader); however, this functionality is missing for assets.
The best I can think of to work around this is to have a root asset (not a source asset) with a config schema that provides all available locations for data loading.
The problem with that approach is that it feels like an anti-pattern. I would much prefer to have the input logic handled by the IO manager, in that layer, instead of implementing that logic within the root asset body.
IO managers currently support input_config_schema and output_config_schema. output_config_schema allows providing configuration that dictates:
Analogous "how it's loaded in all the downstream places it's used" config might also be useful for source assets.
This would be useful in a situation where you want to kick off a run that targets a particular file that you decide at runtime.
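As a rough sketch of that runtime use case, the snippet below builds a per-run config dict in the `assets: <name>: config: url:` layout proposed earlier in this thread. That layout is the proposal under discussion, not a confirmed Dagster schema, and `build_run_config` is a hypothetical helper name.

```python
import json


def build_run_config(asset_name: str, url: str) -> dict:
    """Build run config in the layout proposed in this thread (hypothetical schema)."""
    return {"assets": {asset_name: {"config": {"url": url}}}}


# The file to target is decided at launch time, e.g. from a CLI argument or a sensor.
run_config = build_run_config("entities", "s3://mybucket/path/to/entities.csv")
print(json.dumps(run_config, indent=2))
```

Whatever launches the run would pass this dict as that run's config, so each run can point at a different file without changing the asset definitions.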