Found an alternate solution for the part of defining the `Evaluator`'s metrics: there is a special type of SavedModel called `EvalSavedModel`. However, it turns out `EvalSavedModel` only supports the Estimator API currently.
I found a working solution, and I will add it to the next PR. There are two steps.

First, `model.py` should be modified like below, with two additional signatures:
```diff
+ def _get_transform_features_signature(model, tf_transform_output):
+     # Attach the TFT preprocessing layer to the model so it gets tracked
+     # and exported together with it.
+     model.tft_layer = tf_transform_output.transform_features_layer()
+
+     @tf.function(
+         input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string, name="examples")]
+     )
+     def serve_tf_examples_fn(serialized_tf_examples):
+         # Parse raw tf.Example protos and apply the TFT preprocessing graph.
+         feature_spec = tf_transform_output.raw_feature_spec()
+         parsed_features = tf.io.parse_example(serialized_tf_examples, feature_spec)
+
+         transformed_features = model.tft_layer(parsed_features)
+
+         return transformed_features
+
+     return serve_tf_examples_fn
```
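Only the `transform_features` signature is shown above. The second signature is presumably the one backing the `from_examples` name used in the `eval_config` below, i.e. a function that runs inference on raw serialized examples. A minimal sketch, assuming the helper name, the raw label key, and the overall shape (this is not the code from the PR):

```python
def _get_serve_tf_examples_signature(model, tf_transform_output):
    # Assumed helper: parse raw tf.Example protos, apply the TFT
    # preprocessing graph, then run the trained model on the result.
    model.tft_layer_inference = tf_transform_output.transform_features_layer()

    @tf.function(
        input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string, name="examples")]
    )
    def serve_tf_examples_fn(serialized_tf_examples):
        feature_spec = tf_transform_output.raw_feature_spec()
        feature_spec.pop("label", None)  # "label" is an assumed raw label key
        parsed_features = tf.io.parse_example(serialized_tf_examples, feature_spec)
        transformed_features = model.tft_layer_inference(parsed_features)
        return model(transformed_features)

    return serve_tf_examples_fn
```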
```python
def run_fn(fn_args: FnArgs):
    ...
    ...
    model.save(
        fn_args.serving_model_dir,
        save_format="tf",
        signatures={
            "serving_default": _model_exporter(model),
```
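The snippet above cuts off inside the `signatures` dict. Assuming the signature names referenced by the `eval_config` below, and that `tf_transform_output` is available inside `run_fn` (e.g. via `tft.TFTransformOutput(fn_args.transform_output)`), the full save call would presumably look roughly like this sketch, not necessarily the exact code from the PR:

```python
model.save(
    fn_args.serving_model_dir,
    save_format="tf",
    signatures={
        "serving_default": _model_exporter(model),
        # Preprocessing-only signature, referenced below via
        # preprocessing_function_names=["transform_features"].
        "transform_features": _get_transform_features_signature(
            model, tf_transform_output
        ),
        # Inference-on-raw-examples signature, referenced below via
        # signature_name="from_examples" (assumed mapping).
        "from_examples": _get_serve_tf_examples_signature(
            model, tf_transform_output
        ),
    },
)
```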
Second, the `eval_config` passed to the `Evaluator` should reference the new signatures:

```python
eval_config = tfma.EvalConfig(
    model_specs=[
        tfma.ModelSpec(
            signature_name='from_examples',
            preprocessing_function_names=['transform_features'],
            label_key="label_xf",
            prediction_key="label_xf",
        )
    ],
    slicing_specs=[tfma.SlicingSpec()],
    metrics_specs=[
        tfma.MetricsSpec(
            metrics=[
                tfma.MetricConfig(
                    class_name="SparseCategoricalAccuracy",
                    threshold=tfma.MetricThreshold(
                        value_threshold=tfma.GenericValueThreshold(
                            lower_bound={"value": 0.55}
                        ),
                        # Change threshold will be ignored if there is no
                        # baseline model resolved from MLMD (first run).
                        change_threshold=tfma.GenericChangeThreshold(
                            direction=tfma.MetricDirection.HIGHER_IS_BETTER,
                            absolute={"value": -1e-3},
                        ),
                    ),
                )
            ]
        )
    ],
)
```
```python
evaluator = Evaluator(
    examples=example_gen.outputs["examples"],
    model=trainer.outputs["model"],
    baseline_model=model_resolver.outputs["model"],
    eval_config=eval_config,
)
```
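For context, the `model_resolver` referenced above is the usual latest-blessed-model `Resolver` node; a minimal sketch using the TFX v1 API (the node id is an arbitrary choice):

```python
from tfx import v1 as tfx

# Resolver node that looks up the latest blessed model in MLMD so the
# Evaluator can use it as the baseline for the change threshold.
model_resolver = tfx.dsl.Resolver(
    strategy_class=tfx.dsl.experimental.LatestBlessedModelStrategy,
    model=tfx.dsl.Channel(type=tfx.types.standard_artifacts.Model),
    model_blessing=tfx.dsl.Channel(type=tfx.types.standard_artifacts.ModelBlessing),
).with_id("latest_blessed_model_resolver")
```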
Trained model evaluation is an essential part of the ML pipeline, and TFX comes with the `Evaluator` standard component for it. In theory, we should compare the currently shipped/deployed model to the model trained in the current pipeline run. AFAIK, `Evaluator` is in charge of marking a model `blessed` or `not blessed`. Any model produced by a pipeline is stored as an artifact, and its metadata is stored in a traditional RDB. So, a `Resolver` node with `LatestBlessedModelStrategy` can retrieve the latest blessed model from there, and a blessed model doesn't have to be actually deployed/pushed.

Hence, we can simply include `Evaluator` in the pipeline even if we have 🤗 related pusher components at the end (a rough wiring sketch follows below). In order to make this work, the following should be handled:

- a `Resolver` node in the pipeline
- an `eval_config` with appropriate model performance comparison criteria for the `Evaluator`
- an `Evaluator` component in the pipeline
- `model.py` with an extra signature for preprocessing, and that signature listed in `preprocessing_function_names` in the `model_spec` of the `eval_config`. Since the data fed to the `Evaluator` is raw and not transformed, we should let it be transformed appropriately.

Happy to discuss more on this @sayakpaul
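The rough wiring sketch mentioned above (all component variable names here are assumptions):

```python
# Rough sketch: the Resolver and the Evaluator simply join the pipeline's
# component list, with the 🤗-related pusher component(s) at the end.
components = [
    example_gen,
    # ... StatisticsGen / SchemaGen / Transform / Trainer, etc.
    trainer,
    model_resolver,
    evaluator,
    # 🤗-related pusher component(s) go last
]
```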