zzing0907 opened 1 year ago
@zzing0907,
Could you please provide minimal reproducible code so we can reproduce the error at our end? Please refer to Tensorflow Model Analysis Metrics and Plots for the TFMA-supported metrics and ranking-based metrics. Thank you!
@zzing0907 @singhniraj08 I am also facing a similar issue. I find that this error comes and goes, so it is not easily reproducible; however, it occurs frequently enough to be concerning. I would also like to point out that the monitoring metric referred to by the error message is not the same as the evaluation metrics (in the data-science sense) that TFMA refers to. The monitoring metrics are (likely) created by Apache Beam to track the progress of the workers. So I wonder whether this is actually an issue with Apache Beam. I have filed a similar issue with the Apache Beam team here: https://github.com/apache/beam/issues/27469
Coming at it from the Beam metric side, it looks like numpy.int64 values are being passed to the counter improperly somewhere. Those counters should only receive plain ints, as that is the only type the Beam code will encode before passing the value to a protobuf to be reported. I provided a little context on https://github.com/apache/beam/issues/27469. If you can find where the metric is getting numpy.int64 values and convert them to Python ints at the call site, that should resolve it.
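To illustrate the suggested fix, here is a minimal sketch. `StrictCounter` is a hypothetical stand-in for a Beam counter (it is not Beam's actual implementation) that only accepts plain Python ints, mimicking the encoding step that chokes on numpy.int64; the defensive `int(...)` cast at the call site is the proposed resolution:

```python
import numpy as np

class StrictCounter:
    """Hypothetical stand-in for a Beam counter: accepts only plain
    Python ints, mimicking the encoding step that rejects numpy.int64."""
    def __init__(self):
        self.value = 0

    def inc(self, n=1):
        if type(n) is not int:  # numpy.int64 is not a subclass of int in Python 3
            raise TypeError(f"expected int, got {type(n).__name__}")
        self.value += n

counter = StrictCounter()
batch_size = np.int64(4)  # e.g. derived from a numpy array shape

# counter.inc(batch_size) would raise TypeError here;
# casting to a plain int at the call site avoids it:
counter.inc(int(batch_size))
```

Applying the same `int(...)` cast wherever TFMA passes a numpy-derived value (such as a batch size) into a Beam counter should avoid the error described in this thread.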
System information
Describe the problem
While TFMA is using a Beam metric, an error occurs because the value's type is numpy.int64 rather than int. The error log is as follows; the error occurs when running evaluation with the padding option (tf-ranking metrics). It seems the error occurs while obtaining batch_size from the metric called num_instances.
Source code / logs