Closed: ammarchalifah closed this issue 1 year ago
I've found a solution, i.e. re-implementing the metric with a minor change. Here's the practical solution:
import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K


class QuadraticCohenKappa(tf.keras.metrics.Metric):
    """Quadratic-weighted Cohen's kappa that clips and rounds regression outputs first."""

    def __init__(self, num_classes, name="cohen_kappa", **kwargs):
        super(QuadraticCohenKappa, self).__init__(name=name, **kwargs)
        self.num_classes = num_classes
        # Running confusion matrix accumulated across batches.
        self.conf_mtx = self.add_weight(
            "conf_mtx",
            shape=(self.num_classes, self.num_classes),
            initializer=tf.keras.initializers.zeros,
            dtype=tf.float32,
        )

    def _safe_squeeze(self, y):
        y = tf.squeeze(y)
        # If squeezing produced a scalar (batch size of 1), restore a rank-1 tensor.
        if tf.rank(y) == 0:
            y = tf.expand_dims(y, 0)
        return y

    def update_state(self, y_true, y_pred, sample_weight=None):
        # Clip to the valid label range (0..4 for 5 classes) and round to the nearest class id.
        y_pred = tf.reshape(tf.math.round(tf.clip_by_value(y_pred, 0, 4)), shape=(-1, 1))
        y_pred = tf.cast(y_pred, "int32")
        y_true = tf.cast(y_true, "int32")
        y_pred = self._safe_squeeze(y_pred)
        y_true = self._safe_squeeze(y_true)
        # Accumulate this batch's confusion matrix into the running total.
        new_conf_mtx = tf.math.confusion_matrix(
            labels=y_true,
            predictions=y_pred,
            num_classes=self.num_classes,
            weights=sample_weight,
            dtype=tf.float32,
        )
        self.conf_mtx.assign_add(new_conf_mtx)

    def result(self):
        nb_ratings = tf.shape(self.conf_mtx)[0]

        # 1. Build a matrix whose every row is [1, 2, ..., nb_ratings]
        weight_mtx = tf.ones([nb_ratings, nb_ratings], dtype=tf.float32)
        weight_mtx += tf.cast(tf.range(nb_ratings), dtype=tf.float32)
        weight_mtx = tf.cast(weight_mtx, dtype=self.dtype)

        # 2. Quadratic penalty: (i - j)^2 for every pair of ratings
        weight_mtx = tf.pow((weight_mtx - tf.transpose(weight_mtx)), 2)
        weight_mtx = tf.cast(weight_mtx, dtype=self.dtype)

        # 3. Get the marginal counts
        actual_ratings_hist = tf.reduce_sum(self.conf_mtx, axis=1)
        pred_ratings_hist = tf.reduce_sum(self.conf_mtx, axis=0)

        # 4. Get the outer product of the marginals (expected agreement by chance)
        out_prod = pred_ratings_hist[..., None] * actual_ratings_hist[None, ...]

        # 5. Normalize the confusion matrix and outer product
        conf_mtx = self.conf_mtx / tf.reduce_sum(self.conf_mtx)
        out_prod = out_prod / tf.reduce_sum(out_prod)
        conf_mtx = tf.cast(conf_mtx, dtype=self.dtype)
        out_prod = tf.cast(out_prod, dtype=self.dtype)

        # 6. Calculate the kappa score, falling back to 0.0 if the denominator is nan
        numerator = tf.reduce_sum(conf_mtx * weight_mtx)
        denominator = tf.reduce_sum(out_prod * weight_mtx)
        return tf.cond(
            tf.math.is_nan(denominator),
            true_fn=lambda: 0.0,
            false_fn=lambda: 1 - (numerator / denominator),
        )

    def reset_states(self):
        # The state of the metric is reset at the start of each epoch.
        for v in self.variables:
            K.set_value(
                v,
                np.zeros((self.num_classes, self.num_classes), v.dtype.as_numpy_dtype),
            )
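In case it helps, a minimal usage sketch looks roughly like this; the model, data shapes, and hyperparameters below are placeholders for illustration, not my actual pipeline:

import tensorflow as tf

# Placeholder regression model: one continuous output per example.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(1),
])

model.compile(
    optimizer="adam",
    loss="mse",
    metrics=[QuadraticCohenKappa(num_classes=5)],
)

# Placeholder tf.data pipeline with integer labels in 0..4.
x = tf.random.normal((64, 8))
y = tf.random.uniform((64, 1), maxval=5, dtype=tf.int32)
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(16)

model.fit(dataset, epochs=2)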
Feel free to point out if there's something wrong with this implementation. Thanks
Thanks @ammarchalifah for pointing it out. Can you submit a PR to add tf.data support to the existing codebase?
Sure, I'll work on it.
Quick update: after exploring the code to see what needed fixing, I found that if we pass True to the sparse_labels argument of CohenKappa, it can be used directly with the tf.data API. In other words, my initial issue was invalid.
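For anyone who lands here later, a minimal sketch of that usage with batches coming from a tf.data pipeline; the label and prediction values are illustrative only:

import numpy as np
import tensorflow as tf
import tensorflow_addons as tfa

# Integer class ids (5 classes, 0..4) for both labels and predictions.
y_true = np.array([4, 4, 3, 4, 2, 4, 1, 1], dtype=np.int32)
y_pred = np.array([4, 4, 3, 4, 4, 2, 1, 1], dtype=np.int32)

# Batches delivered through a tf.data pipeline.
dataset = tf.data.Dataset.from_tensor_slices((y_true, y_pred)).batch(4)

metric = tfa.metrics.CohenKappa(num_classes=5, sparse_labels=True, weightage="quadratic")
for labels, preds in dataset:
    metric.update_state(labels, preds)
print(metric.result().numpy())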
Hello TF Addons Community!
I need to use CohenKappa as the main metric in my research. However, I can't use CohenKappa directly because my model is a regression model, so I first need to clip and round the prediction results. I tried to make a customized metric by subclassing and using super().
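Concretely, that clip-and-round step is just the following (for 5 ordinal classes, 0 through 4; the sample values are illustrative):

import tensorflow as tf

# e.g. a raw regression output of 3.7 becomes class 4, and -0.3 becomes class 0.
y_pred = tf.constant([3.7, -0.3, 2.2])
classes = tf.math.round(tf.clip_by_value(y_pred, 0, 4))  # -> [4., 0., 2.]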
Then, I tested this implementation with a simple standalone check, and it worked fine.
Based on this simple test, I initially thought my implementation was correct and ready to do the job. Then I used this custom metric in my model training. For performance, I use a tf.data Dataset object as the data generator, and trained with it.
Here's the issue: the custom MAE works just fine, but my custom cohen_kappa returns nan for every epoch! I'm fairly confident that my implementation is correct based on the previous test, so I think the problem lies in the conversion between tf objects. I see that MAE's implementation involves complex data transformations that are currently too advanced for my skill. So, if you guys can't add tf.data Dataset support to the metrics API, could you help me find a workaround?
Thanks!
Relevant information