Found an alternate solution for the part of defining the `Evaluator`'s metrics: there is a special type of SavedModel called `EvalSavedModel`. However, it turns out `EvalSavedModel` only supports the Estimator API currently.
I found a working solution, and I will add it to the next PR. There are two steps.

First, `model.py` should be modified like below, with two additional signatures:
```diff
+ def _get_transform_features_signature(model, tf_transform_output):
+     # Attach the TFT preprocessing layer to the model so it gets tracked
+     # and exported together with it.
+     model.tft_layer = tf_transform_output.transform_features_layer()
+
+     @tf.function(
+         input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string, name="examples")]
+     )
+     def serve_tf_examples_fn(serialized_tf_examples):
+         # Parse raw tf.Example protos and apply the TFT preprocessing graph.
+         feature_spec = tf_transform_output.raw_feature_spec()
+         parsed_features = tf.io.parse_example(serialized_tf_examples, feature_spec)
+
+         transformed_features = model.tft_layer(parsed_features)
+
+         return transformed_features
+
+     return serve_tf_examples_fn
```
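Only the `transform_features` signature is shown above. The second signature is presumably the one backing the `from_examples` name used in the `eval_config` below, i.e. a function that runs inference on raw serialized examples. A minimal sketch, assuming the helper name, the raw label key, and the overall shape (this is not the code from the PR):

```python
def _get_serve_tf_examples_signature(model, tf_transform_output):
    # Assumed helper: parse raw tf.Example protos, apply the TFT
    # preprocessing graph, then run the trained model on the result.
    model.tft_layer_inference = tf_transform_output.transform_features_layer()

    @tf.function(
        input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string, name="examples")]
    )
    def serve_tf_examples_fn(serialized_tf_examples):
        feature_spec = tf_transform_output.raw_feature_spec()
        feature_spec.pop("label", None)  # "label" is an assumed raw label key
        parsed_features = tf.io.parse_example(serialized_tf_examples, feature_spec)
        transformed_features = model.tft_layer_inference(parsed_features)
        return model(transformed_features)

    return serve_tf_examples_fn
```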
```python
def run_fn(fn_args: FnArgs):
    ...
    ...
    model.save(
        fn_args.serving_model_dir,
        save_format="tf",
        signatures={
            "serving_default": _model_exporter(model),
```
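The snippet above cuts off inside the `signatures` dict. Assuming the signature names referenced by the `eval_config` below, and that `tf_transform_output` is available inside `run_fn` (e.g. via `tft.TFTransformOutput(fn_args.transform_output)`), the full save call would presumably look roughly like this sketch, not necessarily the exact code from the PR:

```python
model.save(
    fn_args.serving_model_dir,
    save_format="tf",
    signatures={
        "serving_default": _model_exporter(model),
        # Preprocessing-only signature, referenced below via
        # preprocessing_function_names=["transform_features"].
        "transform_features": _get_transform_features_signature(
            model, tf_transform_output
        ),
        # Inference-on-raw-examples signature, referenced below via
        # signature_name="from_examples" (assumed mapping).
        "from_examples": _get_serve_tf_examples_signature(
            model, tf_transform_output
        ),
    },
)
```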
Second, the `eval_config` passed to the `Evaluator` should reference the new signatures:

```python
eval_config = tfma.EvalConfig(
    model_specs=[
        tfma.ModelSpec(
            signature_name='from_examples',
            preprocessing_function_names=['transform_features'],
            label_key="label_xf",
            prediction_key="label_xf",
        )
    ],
    slicing_specs=[tfma.SlicingSpec()],
    metrics_specs=[
        tfma.MetricsSpec(
            metrics=[
                tfma.MetricConfig(
                    class_name="SparseCategoricalAccuracy",
                    threshold=tfma.MetricThreshold(
                        value_threshold=tfma.GenericValueThreshold(
                            lower_bound={"value": 0.55}
                        ),
                        # Change threshold will be ignored if there is no
                        # baseline model resolved from MLMD (first run).
                        change_threshold=tfma.GenericChangeThreshold(
                            direction=tfma.MetricDirection.HIGHER_IS_BETTER,
                            absolute={"value": -1e-3},
                        ),
                    ),
                )
            ]
        )
    ],
)
```
```python
evaluator = Evaluator(
    examples=example_gen.outputs["examples"],
    model=trainer.outputs["model"],
    baseline_model=model_resolver.outputs["model"],
    eval_config=eval_config,
)
```
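For context, the `model_resolver` referenced above is the usual latest-blessed-model `Resolver` node; a minimal sketch using the TFX v1 API (the node id is an arbitrary choice):

```python
from tfx import v1 as tfx

# Resolver node that looks up the latest blessed model in MLMD so the
# Evaluator can use it as the baseline for the change threshold.
model_resolver = tfx.dsl.Resolver(
    strategy_class=tfx.dsl.experimental.LatestBlessedModelStrategy,
    model=tfx.dsl.Channel(type=tfx.types.standard_artifacts.Model),
    model_blessing=tfx.dsl.Channel(type=tfx.types.standard_artifacts.ModelBlessing),
).with_id("latest_blessed_model_resolver")
```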
Trained model evaluation is an essential part of the ML pipeline, and TFX comes with the `Evaluator` standard component for it. In theory, we should compare the currently shipped/deployed model to the model trained in the current pipeline run. AFAIK, `Evaluator` is in charge of marking a model `blessed` or `not blessed`. Any model produced by a pipeline is stored as an artifact, and its metadata is stored in a traditional RDB. So, a `Resolver` node with `LatestBlessedModelStrategy` can retrieve the latest blessed model from there, and a blessed model doesn't have to be actually deployed/pushed.

Hence, we can simply include `Evaluator` in the pipeline even if we have 🤗 related pusher components at the end (a rough wiring sketch follows below). In order to make this work, the following should be handled:

- a `Resolver` node in the pipeline
- an `eval_config` with appropriate model performance comparison criteria for the `Evaluator`
- an `Evaluator` component in the pipeline
- `model.py` with an extra signature for preprocessing, and that signature listed in `preprocessing_function_names` in the `model_spec` of the `eval_config`. Since the data fed to the `Evaluator` is raw and not transformed, we should let it be transformed appropriately.

Happy to discuss more on this @sayakpaul
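The rough wiring sketch mentioned above (all component variable names here are assumptions):

```python
# Rough sketch: the Resolver and the Evaluator simply join the pipeline's
# component list, with the 🤗-related pusher component(s) at the end.
components = [
    example_gen,
    # ... StatisticsGen / SchemaGen / Transform / Trainer, etc.
    trainer,
    model_resolver,
    evaluator,
    # 🤗-related pusher component(s) go last
]
```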