System information
Describe the problem
I have implemented a Personalization metric as a tf.keras.metrics.Metric subclass, which is used as input to the Evaluator component of a local TFX pipeline. The evaluation step crashes with an input shape mismatch when running the add_weights method (https://github.com/tensorflow/model-analysis/blob/cc7d75c1bf588123795511572e1b4445d9a52191/tensorflow_model_analysis/metrics/tf_metric_accumulators.py#L110), which is called from the _CompilableMetricsCombiner class. This does not happen for the other custom metrics I implemented, which compute averages over batches (hence scalars, not 2D).
The training step of the TFX pipeline works correctly and runs the metric without problems, so I think the problem is not in the metric implementation but in how TFMA handles/wraps custom Keras metrics.
I tried to hack around it by setting TFMetricsAccumulator._DEFAULT_DESIRED_BATCH_SIZE to 1 and to 2 to avoid size mismatches, but then the metric values are completely off (probably some of the value aggregations do not handle it well).
Source code / logs
The code for the metric implementation is found here
The traceback for the error is found here
Last lines:
  File "/home/user/.local/share/virtualenvs/project-xv7hiZ8W/lib/python3.8/site-packages/tensorflow_model_analysis/evaluators/metrics_plots_and_validations_evaluator.py", line 396, in compact
    return super(_ComputationsCombineFn, self).compact(accumulator)
  File "/home/user/.local/share/virtualenvs/project-xv7hiZ8W/lib/python3.8/site-packages/apache_beam/transforms/combiners.py", line 756, in compact
    return [
  File "/home/user/.local/share/virtualenvs/project-xv7hiZ8W/lib/python3.8/site-packages/apache_beam/transforms/combiners.py", line 757, in <listcomp>
    c.compact(a, *args, **kwargs) for c,
  File "/home/user/.local/share/virtualenvs/project-xv7hiZ8W/lib/python3.8/site-packages/tensorflow_model_analysis/metrics/tf_metric_wrapper.py", line 568, in compact
    self._process_batch(accumulator)
  File "/home/user/.local/share/virtualenvs/project-xv7hiZ8W/lib/python3.8/site-packages/tensorflow_model_analysis/metrics/tf_metric_wrapper.py", line 497, in _process_batch
    accumulator.add_weights(output_index, metric_index,
  File "/home/user/.local/share/virtualenvs/project-xv7hiZ8W/lib/python3.8/site-packages/tensorflow_model_analysis/metrics/tf_metric_accumulators.py", line 120, in add_weights
    self._weights[output_index][metric_index] = np.add(cur_weights, weights)
ValueError: operands could not be broadcast together with shapes (1,937,30208) (1,73,30208)
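For reference, the failure can be sketched outside TFMA with plain NumPy. This is only an illustration, under the assumption that the metric's state has a dimension that follows the batch size (937 and 73 in the traceback). The add_weights helper and NUM_ITEMS below are hypothetical stand-ins, not the TFMA code; only the np.add call is taken from the traceback.

```python
import numpy as np


def add_weights(cur_weights, weights):
    """Hypothetical stand-in for the accumulation step: element-wise np.add."""
    if cur_weights is None:
        return weights
    return np.add(cur_weights, weights)


NUM_ITEMS = 8  # small stand-in for the 30208 dimension in the traceback
BATCH_SIZES = (937, 73)  # the two batch sizes seen in the error message

# Scalar-style state: its shape does not depend on the batch size,
# so accumulating across batches of different sizes works.
scalar_acc = None
for batch_size in BATCH_SIZES:
    scalar_acc = add_weights(scalar_acc, np.array([float(batch_size)]))

# Batch-shaped state (assumption): one row per example in the batch,
# so two batches of different sizes cannot be added element-wise.
batched_acc = None
error_message = None
try:
    for batch_size in BATCH_SIZES:
        state = np.zeros((1, batch_size, NUM_ITEMS))
        batched_acc = add_weights(batched_acc, state)
except ValueError as exc:
    error_message = str(exc)

print("scalar accumulator:", scalar_acc)
print("batched accumulator error:", error_message)
```

If this assumption matches what the metric does, it would also explain why the scalar averaging metrics are unaffected: their state keeps the same shape regardless of how many examples are in a batch.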