sql-machine-learning / elasticdl

Kubernetes-native Deep Learning Framework
https://elasticdl.org
MIT License
731 stars 113 forks source link

Custom model changes needed to unify with SQLFlow model zoo #1476

Closed typhoonzero closed 4 years ago

typhoonzero commented 4 years ago

Background

Unify model zoo implementation of SQLFlow and ElasticDL: https://github.com/sql-machine-learning/models/issues/22 WIP PR: https://github.com/sql-machine-learning/models/pull/27

Custom Model Requirements for Unifying ModelZoo

  1. Support feature_columns argument when initializing a model.
  2. Support a default eval_metrics_fn, so that this function is not "required" when writing a custom model definition.
  3. Current loss(output, labels) function can not be reused in kerasmodel.compile, should be compatible with keras loss functions, like:keras.losses.mean_squared_error(y_true, y_pred): https://keras.io/losses/
  4. dataset_fn is still needed when reading data from MaxCompute: https://github.com/sql-machine-learning/elasticdl/blob/develop/model_zoo/odps_iris_dnn_model/odps_iris_dnn_model.py
terrytangyuan commented 4 years ago
  1. We already support this. Users just need to add feature columns definitions as part of their model definition. Something like the following would work. See tutorial in https://www.tensorflow.org/tutorials/structured_data/feature_columns for details.
    feature_layer = tf.keras.layers.DenseFeatures(feature_columns)
    model = tf.keras.Sequential([
    feature_layer,
    layers.Dense(128, activation='relu'),
    layers.Dense(128, activation='relu'),
    layers.Dense(1, activation='sigmoid')
    ])
  2. What would be a good default for eval_metrics_fn? I think this is highly dependent on the type of model that's defined, e.g. regression vs. classification.
  3. I thought it should be compatible already. @LiMinghao1994 @workingloong @brightcoder01 can take a look at this if that's not the case.
  4. It's not needed if you are using SQLFLow directly since dataset_fn is automatically generated by ElasticDL codegen.
typhoonzero commented 4 years ago

What would be a good default for eval_metrics_fn? I think this is highly dependent on the type of model that's defined, e.g. regression vs. classification.

Well, I'll consider about this. It's true that the eval_metrics_fn is needed for every model.

It's not needed if you are using SQLFLow directly since dataset_fn is automatically generated by ElasticDL codegen.

Is it possible to test a model using MaxCompute as the dataset, so that the dataset_fn is automatically generated by ElasticDL, so that if we need to test a model if it's working, we don't need to write a dataset_fn again.

terrytangyuan commented 4 years ago

Is it possible to test a model using MaxCompute as the dataset, so that the dataset_fn is automatically generated by ElasticDL, so that if we need to test a model if it's working, we don't need to write a dataset_fn again.

I think it should be auto generated by SQFLow instead since ElasticDL's model definition must contain information like feature column names and label column name, which should be written by the user as part of dataset_fn. This kind of information can be easier to obtain through SQLFlow's extended query.

typhoonzero commented 4 years ago

@terrytangyuan The problem is if we want to involve many model developers to contribute models, the dataset_fn should not be a part of the model. It may be a function used to test the model is working, but not in the model's definition file.

typhoonzero commented 4 years ago

I thought it should be compatible already. @LiMinghao1994 @workingloong @brightcoder01 can take a look at this if that's not the case.

The order of output and labels argument should be the same as Keras's loss functions.

terrytangyuan commented 4 years ago

@typhoonzero and I synced offline. To summarize, we will:

  1. Automatically create dataset_fn() internally in ElasticDL so when ODPS data source is used this will be generated automatically without having to implement it in model definition file.
  2. Allow specifying feature col names and label col name through ElasticDL, e.g. --data_reader_params or --envs.
  3. We start with supporting Keras subclass API first. Functional Keras API requires the use of tf.keras.layers.Input which needs to know the input shape, which is not easy to support since dataset_fn and model is decoupled now.
terrytangyuan commented 4 years ago

I thought it should be compatible already. @LiMinghao1994 @workingloong @brightcoder01 can take a look at this if that's not the case.

The order of output and labels argument should be the same as Keras's loss functions.

@typhoonzero This should be fixed by #1490. Please test to see if it works now.

terrytangyuan commented 4 years ago

@typhoonzero This can be closed now, right?