Chapter 7 - TFMA Evaluator AUC Metric Case Mismatch

mshearer0 commented 3 years ago

The interactive pipeline sets a threshold for 'AUC', but the metric produced is named 'auc'. As a result, tfma.load_validation_result reports "Metric not found." for the overall slice and for every product slice.
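A minimal sketch of why the case matters, assuming the threshold name is matched against the computed metric names as a case-sensitive string key. This is not TFMA's implementation: `check_threshold` is a hypothetical helper, and whether the lower bound is inclusive is also an assumption. The value used is the "Consumer Loan" slice AUC reported in this issue.

```python
def check_threshold(metrics, name, lower_bound):
    """Return a validation message for one metric threshold (hypothetical helper)."""
    if name not in metrics:  # dict keys are case-sensitive, so 'AUC' != 'auc'
        return "Metric not found."
    return "ok" if metrics[name] >= lower_bound else "failed"


# "Consumer Loan" slice value reported in this issue.
computed = {"auc": 0.6262196898460388}

print(check_threshold(computed, "AUC", 0.65))  # key missing because of the case mismatch
print(check_threshold(computed, "auc", 0.65))  # key found, value compared to the bound
```

With the key spelled 'AUC' the lookup never reaches the comparison, which matches the "Metric not found." symptom; with 'auc' the value is actually checked against the bound.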

Correcting to:

thresholds={ 'auc':

produces the expected validation failures for several products (the overall slice passes because its AUC is above the 0.65 threshold), as shown below.

However evaluator.outputs['blessing'].get()[0].uri is NOT_BLESSED:

metric_validations_per_slice {
  slice_key {
    single_slice_keys { column: "product" bytes_value: "Consumer Loan" }
  }
  failures {
    metric_key { name: "auc" }
    metric_threshold { value_threshold { lower_bound { value: 0.65 } } }
    metric_value { double_value { value: 0.6262196898460388 } }
  }
}
metric_validations_per_slice {
  slice_key {
    single_slice_keys { column: "product" bytes_value: "Mortgage" }
  }
  failures {
    metric_key { name: "auc" }
    metric_threshold { value_threshold { lower_bound { value: 0.65 } } }
    metric_value { double_value { value: 0.618944525718689 } }
  }
}
metric_validations_per_slice {
  slice_key {
    single_slice_keys { column: "product" bytes_value: "Payday loan" }
  }
  failures {
    metric_key { name: "auc" }
    metric_threshold { value_threshold { lower_bound { value: 0.65 } } }
    metric_value { double_value { value: 0.6383864879608154 } }
  }
}
metric_validations_per_slice {
  slice_key {
    single_slice_keys { column: "product" bytes_value: "Student loan" }
  }
  failures {
    metric_key { name: "auc" }
    metric_threshold { value_threshold { lower_bound { value: 0.65 } } }
    metric_value { double_value { value: 0.6052306294441223 } }
  }
}

mshearer0 commented 3 years ago

Setting auc threshold to 0.6 produces:

validation_ok: true

{evaluator.outputs['blessing'].get()[0].uri} = BLESSED
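The effect of the two thresholds can be reproduced with a small stand-alone sketch that uses only the four per-slice AUC values reported in this issue. `failing_slices` is a hypothetical helper, not part of TFMA, and it assumes a slice fails when its value is strictly below the lower bound.

```python
# Per-slice AUC values copied from the validation output in this issue.
slice_auc = {
    "Consumer Loan": 0.6262196898460388,
    "Mortgage": 0.618944525718689,
    "Payday loan": 0.6383864879608154,
    "Student loan": 0.6052306294441223,
}


def failing_slices(values, lower_bound):
    """Return the sorted names of slices whose metric is below lower_bound."""
    return sorted(name for name, value in values.items() if value < lower_bound)


print(failing_slices(slice_auc, 0.65))  # all four slices fall below 0.65
print(failing_slices(slice_auc, 0.6))   # no slice falls below 0.6
```

All four reported slices sit between 0.6 and 0.65, which is consistent with the change from failures to validation_ok: true once the lower bound is dropped to 0.6.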

catherinenelson1 commented 3 years ago

Hi @mshearer0,

Thank you for spotting the change from 'AUC' to 'auc'. I'll look into it and fix it if I can.

I'm a little confused by the rest of your issue. Could you explain further what the problem is? What is the overall auc for your model? Are you expecting it to pass or fail the validation?

mshearer0 commented 3 years ago

Hi @drcat101,

Changing the threshold key from 'AUC' to 'auc' resolves my problem. I was just sharing my results to illustrate that several slices fail the 0.65 threshold, but after lowering it to 0.6 they all pass and produce the validation_ok: true message shown in the book.
