Open typhoonzero opened 4 years ago
I mostly agree with this idea. For now, we only use very basic Couler operations, and the Couler generation process mostly amounts to copying the step code into the output YAML. However, we need to make sure we will not need the more advanced features, such as conditions and loops: we don't want to reimplement Couler's whole suite of logic in Go.
- We need a local mode to simplify the development and debugging. As a compiler, SQLFlow local mode can directly generate a Python program with several step functions and call them one by one, and with the "workflow mode", SQLFlow can generate the YAML with the step functions directly.
I think that for a local step, we will call the Docker API to launch a container on our laptop instead of directly calling the step functions built upon the runtime library.
This is because models and runnables are released in customized Docker images, so we need to launch a container from these images to execute them.
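A minimal sketch of what launching one local step in a container could look like. It shells out to the `docker run` CLI rather than using the Docker API mentioned above, purely to keep the example dependency-free. The image name, the function names, and the assumption that the image has `python` on its `PATH` are all hypothetical.

```python
import shlex
import subprocess


def docker_run_command(image, step_code):
    """Build the `docker run` argv that executes one step's code inside `image`."""
    # --rm removes the container once the step finishes.
    # Assumption: the image provides a `python` executable.
    return ["docker", "run", "--rm", image, "python", "-c", step_code]


def run_local_step(image, step_code, dry_run=True):
    """Run one step in a container; with dry_run=True only return the command line."""
    argv = docker_run_command(image, step_code)
    if not dry_run:
        # check=True raises CalledProcessError if the container exits non-zero.
        subprocess.run(argv, check=True)
    return shlex.join(argv)


# Hypothetical image name, used only for illustration.
cmd = run_local_step("sqlflow/example-model:latest", "print('train step')")
print(cmd)
```

With `dry_run=True` the sketch only prints the command it would run, which is handy when debugging the generated steps without a Docker daemon.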
We decided to rename Couler to flow.
SQLFlow compiles a SQL program into a workflow program. This workflow program should be able to run in different runtime environments, such as locally with Docker or on Kubernetes with Argo or Tekton.
In the refactored code, we use Couler to generate an Argo YAML file to submit to the Kubernetes cluster to run. A generated Couler program should look like:
Executing the above Couler program should generate a YAML file that embeds the above Python step code.
Using Couler to generate YAML is hard to maintain. We use Couler to generate and submit the workflow, yet we still use Go to fetch the workflow status periodically. The SQLFlow compiler also needs to maintain a Go-side workflow struct in order to do dependency analysis and other optimizations (e.g. Katib?), so there is no need to translate it to the Python side and use Python Couler to implement YAML generation again.
We need a local mode to simplify the development and debugging. As a compiler, SQLFlow local mode can directly generate a Python program with several step functions and call them one by one, and with the "workflow mode", SQLFlow can generate the YAML with the step functions directly.
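To make the two modes concrete, here is a rough sketch of a generator that takes the same list of steps and emits either a Python program (local mode) or an Argo Workflow manifest (workflow mode). It is written in Python for brevity and is not SQLFlow's actual code generator: the step names, step bodies, image name, and function names are all hypothetical. The manifest is printed as JSON, which YAML parsers also accept, to avoid a third-party YAML dependency.

```python
import json

# Hypothetical steps: (name, Python source of the step body).
STEPS = [
    ("train", "print('training model')"),
    ("predict", "print('running prediction')"),
]
IMAGE = "sqlflow/example-step:latest"  # hypothetical image name


def generate_local_program(steps):
    """Local mode: emit one function per step, then call them one by one."""
    lines = []
    for name, body in steps:
        lines.append(f"def {name}():")
        lines.extend("    " + line for line in body.splitlines())
        lines.append("")
    lines.extend(f"{name}()" for name, _ in steps)
    return "\n".join(lines) + "\n"


def generate_workflow(steps, image):
    """Workflow mode: embed each step body in an Argo script template."""
    # In Argo, the outer list of `steps` runs sequentially.
    main = {
        "name": "main",
        "steps": [[{"name": name, "template": name}] for name, _ in steps],
    }
    scripts = [
        {
            "name": name,
            "script": {"image": image, "command": ["python"], "source": body},
        }
        for name, body in steps
    ]
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "sqlflow-"},
        "spec": {"entrypoint": "main", "templates": [main] + scripts},
    }


program = generate_local_program(STEPS)
workflow = generate_workflow(STEPS, IMAGE)
print(program)
print(json.dumps(workflow, indent=2))
```

Both outputs are derived from the same step list, so the local program can be used to debug exactly the code that the workflow would later run on the cluster.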