Use Case
A common use case for the papermill plugin is automatically generating notebook reports at the end of modeling / ETL pipelines. Having a notebook is useful because a data scientist can download it and inspect the results further. Doing the same without a generated notebook is much harder.
Problem
Unfortunately, Papermill's inputs are limited, which makes it difficult to get data and files from Flyte into the notebooks. Doing so currently requires a few extra tasks.
Idea
Provide helper functions that work with common data types like StructuredDataset, FlyteDirectory, and FlyteFile.
It could be an inputs version of record_outputs, although serializing into JSON might be difficult. See https://github.com/flyteorg/flytekit/blob/master/plugins/flytekit-papermill/flytekitplugins/papermill/task.py#L294.
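To make the idea concrete, here is a rough, stdlib-only sketch of what an inputs-side helper pair might look like. Everything in it is hypothetical and not part of flytekit or the papermill plugin: InputRef, encode_inputs, decode_inputs, and the "__flyte_kind__" marker are names made up for illustration. The assumption is that a file, directory, or dataset can be passed to the notebook as a small JSON-friendly pointer (a kind plus a URI) and turned back into a typed reference inside the notebook. A real version would presumably construct FlyteFile, FlyteDirectory, or StructuredDataset objects in the decode step, which is not shown here.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict

# Hypothetical marker key identifying a serialized reference (not a flytekit name).
_KIND_KEY = "__flyte_kind__"


@dataclass
class InputRef:
    """Hypothetical JSON-friendly pointer to a file, directory, or dataset."""

    kind: str  # e.g. "file", "directory", "structured_dataset"
    uri: str   # remote location of the data


def encode_inputs(**inputs: Any) -> Dict[str, Any]:
    """Task side: turn inputs into parameters that survive JSON serialization."""
    params: Dict[str, Any] = {}
    for name, value in inputs.items():
        if isinstance(value, InputRef):
            params[name] = {_KIND_KEY: value.kind, "uri": value.uri}
        else:
            # Fail early: json.dumps raises TypeError for unsupported values.
            json.dumps(value)
            params[name] = value
    return params


def decode_inputs(params: Dict[str, Any]) -> Dict[str, Any]:
    """Notebook side: rebuild typed references from the injected parameters."""
    decoded: Dict[str, Any] = {}
    for name, value in params.items():
        if isinstance(value, dict) and _KIND_KEY in value:
            decoded[name] = InputRef(kind=value[_KIND_KEY], uri=value["uri"])
        else:
            decoded[name] = value
    return decoded
```

In this sketch, plain values such as strings and numbers pass through unchanged, while anything that cannot be serialized to JSON is rejected before the notebook runs rather than failing inside it.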