sql-machine-learning / elasticdl

Kubernetes-native Deep Learning Framework
https://elasticdl.org
MIT License

Should we build a Docker image each time we execute periodic ElasticDL training in the SQLFlow pipeline? #1880

Open brightcoder01 opened 4 years ago

brightcoder01 commented 4 years ago

For ElasticDL training, the model Python file in the image should be complete: it contains both the model's main structure and the feature engineering code (using feature columns / Keras preprocessing layers).

From the model zoo, we can get a Docker image containing the Python file for the model's main structure. The feature engineering code is auto-generated from the COLUMN expression and the data analysis result.

For periodic training, the training pipeline described by a SQL statement is executed regularly, each run reading a different date partition of the source table. The pipeline contains the steps of data analysis, transform code generation, and training execution. Since the data analysis results (hash_bucket_size, bucketize boundaries, min/max) differ across data partitions, the generated transform code differs as well, and therefore so does the complete model Python file in the ElasticDL training container.
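To make the problem concrete, here is a minimal, hypothetical sketch of the transform code-generation step: the emitted feature engineering code embeds partition-specific statistics, so two partitions with different analysis results produce different model files. All names (the template, field names, boundary values) are illustrative assumptions, not the actual SQLFlow code generator.

```python
# Hypothetical code_gen sketch: render feature engineering code from one
# partition's analysis result. The template and field names are assumed.
TEMPLATE = """import tensorflow as tf

age = tf.feature_column.bucketized_column(
    tf.feature_column.numeric_column("age"),
    boundaries={boundaries})
city = tf.feature_column.categorical_column_with_hash_bucket(
    "city", hash_bucket_size={hash_bucket_size})
"""

def gen_transform_code(analysis_result):
    """Embed partition-specific statistics into the generated transform code."""
    return TEMPLATE.format(
        boundaries=analysis_result["bucketize_boundaries"],
        hash_bucket_size=analysis_result["hash_bucket_size"],
    )

# Two date partitions with different statistics yield different generated
# code, hence a different complete model file (and image) per partition.
code_0701 = gen_transform_code(
    {"bucketize_boundaries": [18, 25, 40, 65], "hash_bucket_size": 1000})
code_0702 = gen_transform_code(
    {"bucketize_boundaries": [18, 30, 45, 60], "hash_bucket_size": 1200})
print(code_0701 != code_0702)  # True
```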

If we build a new image each time the daily training pipeline runs, the image list will grow huge and consume a lot of storage.

workingloong commented 4 years ago

Can we write those analysis results to a file? Then we can read them to build the model in the container, using the same image.
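A minimal sketch of this proposal, with all paths and field names assumed for illustration: the analysis step serializes the per-partition statistics to a file, and the training container (built once) deserializes them at startup to parameterize the transform code, so no per-partition image is needed.

```python
import json

# Hypothetical helpers: the file path, format (JSON), and field names
# are assumptions; the real pipeline would write to shared storage.
def dump_analysis_result(result, path):
    """Analysis step: persist one partition's statistics."""
    with open(path, "w") as f:
        json.dump(result, f)

def load_analysis_result(path):
    """Training container: read the statistics back at startup."""
    with open(path) as f:
        return json.load(f)

# The analysis step writes the statistics for one date partition ...
dump_analysis_result(
    {"hash_bucket_size": 1000,
     "bucketize_boundaries": [18, 25, 40, 65],
     "min": 0, "max": 120},
    "/tmp/analysis_20200701.json")

# ... and the fixed training image reads them to build its feature
# columns, instead of baking them into a freshly built image.
stats = load_analysis_result("/tmp/analysis_20200701.json")
print(stats["hash_bucket_size"])  # 1000
```

The open question below (where such a file lives) is exactly what this sketch leaves out: `/tmp` only works on a single node, so the path would have to point at storage visible to both the analysis step and the training pods.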

skydoorkai commented 4 years ago

> Can we write those analysis results to a file? Then we can read them to build the model in the container, using the same image.

Where would this file be stored? Does SQLFlow provide shared storage?