sql-machine-learning / elasticdl

Kubernetes-native Deep Learning Framework
https://elasticdl.org
MIT License
732 stars 113 forks source link

Generate the transform code from the SQLFlow statement using code_gen #1686

Open brightcoder01 opened 4 years ago

brightcoder01 commented 4 years ago

The root of the discussion series is #1670 The SQLFlow syntax design of data transform is discussed in #1664

From this SQLFlow statement, we will then generate the transform python code. There are three options for the style of the generated transform code:

  1. Feature Column API. Integrate it with model definition using tf.keras.layers.DenseFeatures;
  2. Customized Keras Layer provided from ElasticDL. The functionality should cover all the common used feature engineering operations above;
  3. Keras Preprocess Layer. This will be ready in TF2.2;
brightcoder01 commented 4 years ago

COLUMN expression -> Parser -> intermediate representative (Transform DAG, Table Schema) -> Generated Python Code.

The parsing result for the COLUMN expression is a DAG, we can see the clause in this sample