tensorflow / model-analysis

Model analysis tools for TensorFlow

Custom metric with 2d weights is not working #123

Open RazvanPasca opened 3 years ago

RazvanPasca commented 3 years ago

System information

Describe the problem

I have implemented a Personalization metric as a tf.keras.Metric subclass, which is passed to the Evaluator component of a local TFX pipeline. The evaluation step crashes with an input shape mismatch in the add_weights method (https://github.com/tensorflow/model-analysis/blob/cc7d75c1bf588123795511572e1b4445d9a52191/tensorflow_model_analysis/metrics/tf_metric_accumulators.py#L110), which is reached from the _CompilableMetricsCombiner class. This does not happen for the other custom metrics I implemented, which compute averages over batches and therefore hold scalar weights rather than 2d ones.

The training step of the TFX pipeline works correctly and runs the metric without problems, so I think the problem is not in the metric implementation but in how tfma handles/wraps custom Keras metrics.

I tried to work around it by setting TFMetricsAccumulator._DEFAULT_DESIRED_BATCH_SIZE to 1 and to 2 to avoid the size mismatch, but then the metric values are completely off (probably some of the value aggregations do not handle it well).

Source code / logs

The code for the metric implementation is found here

The traceback for the error is found here. Last lines:

  File "/home/user/.local/share/virtualenvs/project-xv7hiZ8W/lib/python3.8/site-packages/tensorflow_model_analysis/evaluators/metrics_plots_and_validations_evaluator.py", line 396, in compact
    return super(_ComputationsCombineFn, self).compact(accumulator)
  File "/home/user/.local/share/virtualenvs/project-xv7hiZ8W/lib/python3.8/site-packages/apache_beam/transforms/combiners.py", line 756, in compact
    return [
  File "/home/user/.local/share/virtualenvs/project-xv7hiZ8W/lib/python3.8/site-packages/apache_beam/transforms/combiners.py", line 757, in <listcomp>
    c.compact(a, *args, **kwargs) for c,
  File "/home/user/.local/share/virtualenvs/project-xv7hiZ8W/lib/python3.8/site-packages/tensorflow_model_analysis/metrics/tf_metric_wrapper.py", line 568, in compact
    self._process_batch(accumulator)
  File "/home/user/.local/share/virtualenvs/project-xv7hiZ8W/lib/python3.8/site-packages/tensorflow_model_analysis/metrics/tf_metric_wrapper.py", line 497, in _process_batch
    accumulator.add_weights(output_index, metric_index,
  File "/home/user/.local/share/virtualenvs/project-xv7hiZ8W/lib/python3.8/site-packages/tensorflow_model_analysis/metrics/tf_metric_accumulators.py", line 120, in add_weights
    self._weights[output_index][metric_index] = np.add(cur_weights, weights)
ValueError: operands could not be broadcast together with shapes (1,937,30208) (1,73,30208)
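For anyone trying to reproduce this outside of TFX, here is a minimal NumPy-only sketch of what the last frame of the traceback seems to be doing. It is not the tfma code: add_weights and try_accumulate below are hypothetical stand-ins, the shapes are shrunk, and it assumes the differing middle dimension (937 vs 73) comes from batches of different sizes.

```python
import numpy as np


def add_weights(cur_weights, weights):
    """Hypothetical stand-in for the accumulation step (not the tfma method)."""
    if cur_weights is None:
        return weights
    # Element-wise sum: only works if the two shapes are broadcast-compatible.
    return np.add(cur_weights, weights)


def try_accumulate(batch_shapes):
    """Accumulate one all-ones weight array per batch and report the outcome."""
    acc = None
    for shape in batch_shapes:
        try:
            acc = add_weights(acc, np.ones(shape))
        except ValueError as err:
            return f"failed: {err}"
    return f"ok: accumulated shape {acc.shape}"


# Scalar-like weights have the same shape for every batch, so they sum fine.
scalar_like = try_accumulate([(1,), (1,)])

# Weights whose middle dimension differs between batches
# (assumption: that dimension depends on the batch size).
batch_dependent = try_accumulate([(1, 937, 8), (1, 73, 8)])

print(scalar_like)
print(batch_dependent)
```

If that assumption holds, it would explain why the scalar metrics are unaffected: their weights have the same shape in every batch, whereas a 2d weight with a batch-dependent dimension cannot be summed element-wise across batches.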

RazvanPasca commented 3 years ago

Hello! Any updates on this?