mediacloud / sous-chef

Configurable Data Analytics Pipeline

braindump for interface architecture thoughts. #21

Open pgulley opened 2 months ago

pgulley commented 2 months ago

system-metrics is a cool example of how I want to use sous-chef in the future: packaging the actual prefect deployment entrypoints in their own packages, which rely on a versioned (or branch-specified) sous-chef package.

The contents of run-recipe.py should be broken out into such a package, probably something like a sous-chef-s3 or sous-chef-cloudloader package.

It should also be possible to package up new user-interface concerns in such a package, such as methods to deploy sous-chef runs and access their output via the prefect-client-api, and streamlit applications to expose those to an end user.

It's probably prudent at first to start with simple set-in-place recipes, so an app which just mimics the csv-download functionality might come first, or a simple entity-extractor.

One thing that would be nice to have in this context is better typing and input prediction for that use case: could we take a given recipe and just generate a pydantic schema describing the inputs it expects? Streamlit can generate a UI from a pydantic schema with the right plugins, so we could have a light developer load on future applications once the infrastructure is settled.
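A rough sketch of what generating a schema from a recipe could look like, using pydantic's `create_model`. The `RECIPE_VARS` dictionary, its `type`/`default` keys, and the `schema_from_recipe_vars` helper are all made up for illustration; how sous-chef actually exposes a recipe's expected inputs may look quite different.

```python
from pydantic import create_model

# Hypothetical description of the inputs a recipe expects.
# This is NOT the real sous-chef recipe format, just an assumption.
RECIPE_VARS = {
    "QUERY": {"type": "str"},
    "START_DATE": {"type": "str"},
    "MAX_STORIES": {"type": "int", "default": 1000},
}

# Map the type names used above onto Python types.
TYPE_NAMES = {"str": str, "int": int, "float": float, "bool": bool}


def schema_from_recipe_vars(name, recipe_vars):
    """Build a pydantic model class with one field per recipe variable."""
    fields = {}
    for var, spec in recipe_vars.items():
        py_type = TYPE_NAMES[spec["type"]]
        # Ellipsis (...) marks a field as required in pydantic.
        default = spec.get("default", ...)
        fields[var] = (py_type, default)
    return create_model(name, **fields)


RecipeInputs = schema_from_recipe_vars("RecipeInputs", RECIPE_VARS)

# Required fields must be supplied; MAX_STORIES falls back to its default.
inputs = RecipeInputs(QUERY="climate", START_DATE="2024-01-01")
```

The generated class behaves like any hand-written pydantic model, so anything that can build a form from a pydantic model could in principle be pointed at it.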

We could also use the validation steps built into sous-chef to pre-validate input before things are submitted to the cloud service.
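Something like the following gate is what I have in mind for pre-validation. It is only a sketch: the validator functions, the input names, and the `submit` callable are hypothetical stand-ins, not the validation steps sous-chef actually ships.

```python
from typing import Callable, Dict, List

Inputs = Dict[str, str]
Validator = Callable[[Inputs], List[str]]


def require_query(inputs: Inputs) -> List[str]:
    # Hypothetical check: a recipe run needs a non-empty query.
    if not inputs.get("QUERY", "").strip():
        return ["QUERY must not be empty"]
    return []


def check_date_order(inputs: Inputs) -> List[str]:
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    if inputs.get("START_DATE", "") > inputs.get("END_DATE", ""):
        return ["START_DATE must not be after END_DATE"]
    return []


def presubmit(inputs: Inputs, validators: List[Validator],
              submit: Callable[[Inputs], str]) -> dict:
    """Run every validator locally; only call submit() if all of them pass."""
    errors = [msg for check in validators for msg in check(inputs)]
    if errors:
        return {"submitted": False, "errors": errors}
    return {"submitted": True, "run_id": submit(inputs)}
```

The point is that a bad request comes back with a list of errors and never reaches the cloud service; `submit` would be whatever actually kicks off the run.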

I'm imagining that we would want users to provide their own API keys (that's simple enough), but on top of that we'd want some persistent way inside of prefect to make sure that one user can't spam prefect with content. This might be doable within the prefect api if we have some consistent way of tagging flow runs with user ids? Which starts to suggest that we want a wrapper around the prefect-client to standardize these into a sous-chef deployment api.
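A minimal sketch of the wrapper idea, assuming each run is tagged `user:<id>` and a per-user cap on active runs is checked before anything is submitted. `RunBackend`, `InMemoryBackend`, `SousChefDeployments`, and `QuotaExceeded` are all hypothetical names; a real backend would talk to the prefect client, which is replaced here by an in-memory stand-in so the logic can be followed on its own.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


class QuotaExceeded(Exception):
    """Raised when a user already has too many active runs."""


class RunBackend(Protocol):
    # Hypothetical interface; a real implementation would wrap the prefect client.
    def count_active_runs(self, tag: str) -> int: ...
    def submit(self, deployment: str, parameters: dict, tags: List[str]) -> str: ...


@dataclass
class InMemoryBackend:
    """Stand-in backend that records runs in a list instead of calling prefect."""
    runs: List[dict] = field(default_factory=list)

    def count_active_runs(self, tag: str) -> int:
        return sum(1 for r in self.runs if r["active"] and tag in r["tags"])

    def submit(self, deployment: str, parameters: dict, tags: List[str]) -> str:
        run_id = f"run-{len(self.runs) + 1}"
        self.runs.append({"id": run_id, "deployment": deployment,
                          "parameters": parameters, "tags": tags, "active": True})
        return run_id


class SousChefDeployments:
    """Tags every run with the submitting user and enforces a per-user cap."""

    def __init__(self, backend: RunBackend, max_active_per_user: int = 2):
        self.backend = backend
        self.max_active_per_user = max_active_per_user

    def submit(self, user_id: str, deployment: str, parameters: dict) -> str:
        tag = f"user:{user_id}"
        if self.backend.count_active_runs(tag) >= self.max_active_per_user:
            raise QuotaExceeded(f"{user_id} already has "
                                f"{self.max_active_per_user} active runs")
        return self.backend.submit(deployment, parameters, tags=[tag])
```

Keeping the tag format in one place is the main benefit: every application built on the wrapper tags runs the same way, so per-user lookups stay consistent.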

That's kind of the dream in a nutshell, and I think it's got a lot of good iterative development steps. Once I find the time...