deep-diver / semantic-segmentation-ml-pipeline

Machine Learning Pipeline for Semantic Segmentation with TensorFlow Extended (TFX) and various GCP products
https://blog.tensorflow.org/2023/01/end-to-end-pipeline-for-segmentation-tfx-google-cloud-hugging-face.html
Apache License 2.0

Add Evaluator component to the pipeline #21

Closed · deep-diver closed this issue 2 years ago

deep-diver commented 2 years ago

Trained model evaluation is an essential part of an ML pipeline, and TFX ships a standard Evaluator component for it. In principle, we should compare the model trained in the current pipeline run against the currently shipped/deployed model.

AFAIK, the Evaluator is in charge of marking a model as blessed or not blessed. Every model a pipeline produces is stored as an artifact, and its metadata is recorded in a relational database by ML Metadata (MLMD). So a Resolver node with LatestBlessedModelStrategy can retrieve the latest blessed model from there, and a blessed model doesn't have to be actually deployed/pushed.
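
For reference, a minimal sketch of such a Resolver node, following the standard TFX pattern (the component id is illustrative; the actual pipeline may name things differently):

```python
from tfx import v1 as tfx

# Sketch: fetch the latest blessed model (and its blessing) from MLMD so the
# Evaluator can use it as the baseline for the change threshold.
model_resolver = tfx.dsl.Resolver(
    strategy_class=tfx.dsl.experimental.LatestBlessedModelStrategy,
    model=tfx.dsl.Channel(type=tfx.types.standard_artifacts.Model),
    model_blessing=tfx.dsl.Channel(type=tfx.types.standard_artifacts.ModelBlessing),
).with_id("latest_blessed_model_resolver")
```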

Hence, we can simply include the Evaluator in the pipeline even though the 🤗-related pusher components sit at the end. To make this work, the following needs to be handled:

Happy to discuss this more, @sayakpaul.

deep-diver commented 2 years ago

Found an alternative solution for the part about defining the Evaluator's metrics: there is a special type of SavedModel called EvalSavedModel.

It turns out EvalSavedModel currently only supports Estimators.

deep-diver commented 2 years ago

I found a working solution, and I will add it in the next PR. There are two steps:

  1. `model.py` should be modified like below, with two additional signatures:

     ```diff
     + def _get_transform_features_signature(model, tf_transform_output):
     +     # Attach the Transform graph so raw examples can be preprocessed for evaluation.
     +     model.tft_layer = tf_transform_output.transform_features_layer()
     +
     +     @tf.function(
     +         input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string, name="examples")]
     +     )
     +     def serve_tf_examples_fn(serialized_tf_examples):
     +         # Parse serialized tf.Examples and run them through the Transform layer.
     +         feature_spec = tf_transform_output.raw_feature_spec()
     +         parsed_features = tf.io.parse_example(serialized_tf_examples, feature_spec)
     +
     +         transformed_features = model.tft_layer(parsed_features)
     +
     +         return transformed_features
     +
     +     return serve_tf_examples_fn
     ```

     and `run_fn` should register the new signature(s) when saving the model:

     ```python
     def run_fn(fn_args: FnArgs):
         ...
         model.save(
             fn_args.serving_model_dir,
             save_format="tf",
             signatures={
                 "serving_default": _model_exporter(model),
                 ...
             },
         )
     ```

  2. Then write the `Evaluator` like below:
    
     ```python
     eval_config = tfma.EvalConfig(
         model_specs=[
             tfma.ModelSpec(
                 signature_name='from_examples',
                 preprocessing_function_names=['transform_features'],
                 label_key="label_xf",
                 prediction_key="label_xf",
             )
         ],
         slicing_specs=[tfma.SlicingSpec()],
         metrics_specs=[
             tfma.MetricsSpec(
                 metrics=[
                     tfma.MetricConfig(
                         class_name="SparseCategoricalAccuracy",
                         threshold=tfma.MetricThreshold(
                             value_threshold=tfma.GenericValueThreshold(
                                 lower_bound={"value": 0.55}
                             ),
                             # Change threshold will be ignored if there is no
                             # baseline model resolved from MLMD (first run).
                             change_threshold=tfma.GenericChangeThreshold(
                                 direction=tfma.MetricDirection.HIGHER_IS_BETTER,
                                 absolute={"value": -1e-3},
                             ),
                         ),
                     )
                 ]
             )
         ],
     )
     ```

     ```python
     evaluator = Evaluator(
         examples=example_gen.outputs["examples"],
         model=trainer.outputs["model"],
         baseline_model=model_resolver.outputs["model"],
         eval_config=eval_config,
     )
     ```
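
For reference, this is roughly how the Evaluator's blessing gates a downstream push step. The sketch below uses the standard TFX `Pusher` for illustration rather than the 🤗 pusher components from this repo, and `serving_model_dir` is a placeholder:

```python
from tfx import v1 as tfx

# Sketch only: the Pusher consumes the blessing from the Evaluator, so the model
# is pushed only when it passes the value/change thresholds in eval_config.
pusher = tfx.components.Pusher(
    model=trainer.outputs["model"],
    model_blessing=evaluator.outputs["blessing"],  # gate on the Evaluator's verdict
    push_destination=tfx.proto.PushDestination(
        filesystem=tfx.proto.PushDestination.Filesystem(
            base_directory=serving_model_dir  # placeholder path
        )
    ),
)
```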