VizierDB / vizier-scala

The Vizier kernel-free notebook programming environment

ML ETL Cell #187

Open okennedy opened 2 years ago

okennedy commented 2 years ago

Spark has some pretty good ETL capabilities built in. It might be fairly easy to add a cell or similar construct that lets us use them to build ETL pipelines.

https://spark.apache.org/docs/latest/ml-features.html
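As a rough illustration of the Spark feature transformers linked above, here is a minimal sketch of an ETL-style pipeline built from the stock `org.apache.spark.ml` API. The column names (`category`, `amount`), the sample rows, and the `EtlPipelineSketch` object are made up for the example, and nothing here reflects how a Vizier cell would actually wire this up.

```scala
import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.feature.{StandardScaler, StringIndexer, VectorAssembler}
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: column names and data are assumptions, not part of Vizier.
object EtlPipelineSketch {

  // Three chained feature stages: index a string column, assemble the
  // numeric columns into one vector, then scale that vector.
  val pipeline: Pipeline = new Pipeline().setStages(Array[PipelineStage](
    new StringIndexer()
      .setInputCol("category")
      .setOutputCol("category_idx")
      .setHandleInvalid("keep"), // unseen labels get an extra index instead of failing
    new VectorAssembler()
      .setInputCols(Array("category_idx", "amount"))
      .setOutputCol("features_raw"),
    new StandardScaler()
      .setInputCol("features_raw")
      .setOutputCol("features")
  ))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("etl-sketch")
      .getOrCreate()
    import spark.implicits._

    // Tiny in-memory dataset standing in for a real input table.
    val raw = Seq(("a", 1.0), ("b", 2.5), ("a", 4.0)).toDF("category", "amount")

    // fit() learns the index mapping and scaling statistics;
    // the resulting PipelineModel is itself a Transformer.
    val model = pipeline.fit(raw)
    model.transform(raw).select("category", "amount", "features").show(false)

    spark.stop()
  }
}
```

The fitted `PipelineModel` is a single Transformer that can be applied to any DataFrame with the same input columns, which is what would make it a natural unit for a cell to produce.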

Side note: Would it be worth reframing Vizier Dataset artifact objects as Spark Transformers?

okennedy commented 1 year ago

This should actually be a purely UI-based challenge, now that we support Transformers as Vizier datasets.