tensorflow / addons

Useful extra functionality for TensorFlow 2.x maintained by SIG-addons
Apache License 2.0

Matthews correlation coefficient for multi-class #2339

Closed ZeynepP closed 1 year ago

ZeynepP commented 3 years ago

Hi to all,

I am trying to interpret the different results I am getting in TensorFlow for MCC. I have 3 classes and I am using categorical_crossentropy as the loss function. I added MCC to the metrics from TensorFlow Addons as follows:

from sklearn.metrics import matthews_corrcoef

tfa.metrics.MatthewsCorrelationCoefficient(name="mcc", num_classes=3)
...
# MCC as reported by model.evaluate (the TFA metric)
test_metrics = model.evaluate(test_gen, verbose=1)
test_mcc = test_metrics[model.metrics_names.index("mcc")]

# This part is added to test the difference between evaluate and predict
y_classes = test_gen.targets.argmax(axis=-1)   # integer labels from one-hot targets
pred_prop = model.predict(test_gen, verbose=1)  # predicted class probabilities
test_mcc_sklearn = matthews_corrcoef(y_classes, pred_prop.argmax(axis=-1))

Training output (example for one epoch):
... - loss: 0.7549 - acc: 0.7276 - auc: 0.8313 - mcc: 0.3735 - val_loss: 0.7677 - val_acc: 0.7302 - val_auc: 0.8211 - val_mcc: 0.3745 ...

Evaluate output:
... loss: 0.7879 - acc: 0.7124 - auc: 0.8107 - mcc: 0.3768 ...

test_mcc = [0.24070711 0.29297936 0.5968444 ]
test_mcc_sklearn = 0.04130715666673915

It is the first time I am using MCC and I am really confused by these different results. What am I missing? Thanks in advance.

ZeynepP commented 3 years ago

Very basic example:

https://colab.research.google.com/drive/1dErU9wesGrXC7ILJEfpocF9zK5laRfX7?usp=sharing

Bests

jonpsy commented 3 years ago

Hi! May I take this up?

ZeynepP commented 3 years ago

I wish someone would...

jonpsy commented 3 years ago

I think our current implementation is wrong: it uses one-vs-all to get a Matthews coefficient for each class. Instead, it should use the generalized Matthews formula (for both the multiclass and binary cases) as given here: https://en.wikipedia.org/wiki/Matthews_correlation_coefficient#Multiclass_case. The same is implemented in sklearn; a summary of that formula is below.
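
For reference, the generalized formula from that Wikipedia section (the same one sklearn implements): with confusion matrix C, let c be the number of correctly predicted samples (the trace of C), s the total number of samples, t_k the number of times class k truly occurred, and p_k the number of times class k was predicted. Then

    \mathrm{MCC} = \frac{c\,s - \sum_k t_k\, p_k}{\sqrt{\left(s^2 - \sum_k p_k^2\right)\left(s^2 - \sum_k t_k^2\right)}}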

Have I gathered the context right @autoih @marload?

ZeynepP commented 3 years ago

I debugged tfa's MCC and sklearn's MCC. As @jonpsy mentioned, the formula implemented in the current version is not correct. I updated it following the wiki and the sklearn code:

def update_state(self, y_true, y_pred, sample_weight=None):
    y_true = tf.cast(y_true, dtype=self.dtype)
    y_pred = tf.cast(y_pred, dtype=self.dtype)

    # Multiclass confusion matrix built from the argmax of the one-hot labels / probabilities
    C = tf.math.confusion_matrix(
        labels=tf.argmax(y_true, 1),
        predictions=tf.argmax(y_pred, 1),
        num_classes=self.num_classes,
        weights=sample_weight,
        dtype=self.dtype,
    )

    t_sum = tf.reduce_sum(C, axis=1)  # true occurrences of each class
    p_sum = tf.reduce_sum(C, axis=0)  # predictions of each class

    n_correct = tf.linalg.trace(C)
    n_samples = tf.reduce_sum(p_sum)

    # Generalized (multiclass) MCC, as on Wikipedia and in sklearn
    cov_ytyp = n_correct * n_samples - tf.tensordot(t_sum, p_sum, axes=1)
    cov_ypyp = n_samples ** 2 - tf.tensordot(p_sum, p_sum, axes=1)
    cov_ytyt = n_samples ** 2 - tf.tensordot(t_sum, t_sum, axes=1)
    self.mcc = cov_ytyp / tf.math.sqrt(cov_ytyt * cov_ypyp)
    if tf.math.is_nan(self.mcc):
        # Degenerate case (e.g. every sample in one class): define MCC as 0
        self.mcc = tf.constant(0, dtype=self.dtype)

def result(self):
    return self.mcc

I am getting the same result as sklearn with this implementation.
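
For anyone who wants to sanity-check the fix, a small comparison against sklearn could look like the sketch below. PatchedMCC is just a placeholder name for a Metric subclass that holds the update_state/result above (with num_classes and dtype defined as in the original TFA metric), and it takes one-hot inputs like the TFA metric does:

    import numpy as np
    import tensorflow as tf
    from sklearn.metrics import matthews_corrcoef

    # Toy 3-class example with integer labels
    y_true = np.array([0, 1, 2, 2, 1, 0, 2, 1])
    y_pred = np.array([0, 2, 2, 2, 1, 0, 1, 1])

    print(matthews_corrcoef(y_true, y_pred))  # sklearn's generalized multiclass MCC

    metric = PatchedMCC(num_classes=3)        # placeholder name for the patched metric class above
    metric.update_state(tf.one_hot(y_true, 3), tf.one_hot(y_pred, 3))
    print(metric.result().numpy())            # should print (approximately) the same value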

FYI. Bests

jonpsy commented 3 years ago

I was meaning to implement this myself, but since nobody was replying I reconsidered whether I was permitted to do it (and whether you were still interested). Regardless, I'm happy to see your question answered :)

WindQAQ commented 3 years ago

Hi all, I am sorry I have just noticed this thread. It's more than welcome to submit a PR for it! Thanks!

jonpsy commented 3 years ago

@WindQAQ I was wondering, how about we keep them both? That is, the Matthews correlation in the one-vs-all form and the generalized form, selected with a boolean switch?
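
Just to illustrate what I mean (the per_class argument here is hypothetical, not something TFA currently has):

    # Hypothetical API sketch -- "per_class" is not an existing TFA argument.
    mcc_vector = tfa.metrics.MatthewsCorrelationCoefficient(num_classes=3, per_class=True)   # one-vs-all, one value per class
    mcc_scalar = tfa.metrics.MatthewsCorrelationCoefficient(num_classes=3, per_class=False)  # generalized multiclass MCC (sklearn-style)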

aminzabardast commented 3 years ago

Hi, Everyone. Since we have this fix in the merge queue, I was wondering if the stable version's (0.12.1) MCC for Multiclass is implemented correctly or not. Shall I refrain from using it?

jonpsy commented 3 years ago

> Hi, Everyone. Since we have this fix in the merge queue, I was wondering if the stable version's (0.12.1) MCC for Multiclass is implemented correctly or not. Shall I refrain from using it?

It's not wrong per se: it computes a one-vs-all coefficient for each class, so in that sense it is "multi-class".

In the general sense, though, the implementation is "wrong". Personally, I'd refrain from using it.

aminzabardast commented 3 years ago

@jonpsy Thanks.

pedrogalher commented 3 years ago

@jonpsy Hi, I think it would be great to have both methods: MCC per class and MCC for the whole model. I work as a scientist, and sometimes I find it more useful to have the segmentation performance of a model with respect to each of the classes I want to segment rather than the overall model performance. There is some research that uses the per-class MCC, like this one or this one.
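
In the meantime, a minimal way to get a per-class (one-vs-rest) MCC outside of TFA, assuming integer labels and predictions, is something like:

    import numpy as np
    from sklearn.metrics import matthews_corrcoef

    def per_class_mcc(y_true, y_pred, num_classes):
        """One-vs-rest MCC for each class (roughly what the pre-fix TFA metric reports)."""
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        return [matthews_corrcoef(y_true == k, y_pred == k) for k in range(num_classes)]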

syedRumman commented 2 years ago

Hi, I was wondering if this issue has been resolved for multiclass?

lukaszdz commented 1 year ago

When I tried to utilize MCC in my LSTM, it was always returning zero. Model definition:

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Embedding, Dropout, LSTM, Dense

    # voc_size and embedding_vector_features are defined elsewhere in my script
    model = Sequential()
    model.add(Embedding(voc_size, embedding_vector_features, input_length=25))
    model.add(Dropout(0.4))  # reduce overfitting
    model.add(LSTM(100))
    model.add(Dropout(0.4))
    model.add(Dense(1, activation="sigmoid"))  # single sigmoid unit -> labels/preds have shape (batch, 1)
    model.compile(
        loss="binary_crossentropy",
        optimizer="adam",
        metrics=[MatthewsCorrelationCoefficient2(num_classes=2)],  # my modified copy of the TFA metric (see below)
        run_eagerly=True,
    )

I had to modify the source code in MCC to get the metric to report a nonzero number for training and validation:

    # Note: .numpy() requires run_eagerly=True (set in compile above) and `import numpy as np`.
    def update_state(self, y_true, y_pred, sample_weight=None):
        y_true = tf.cast(y_true, dtype=self.dtype)
        # Labels come in with shape (batch, 1); squeeze them to a flat vector of 0/1 labels.
        y_true_2 = tf.constant(np.squeeze(y_true.numpy()))
        y_pred = tf.cast(y_pred, dtype=self.dtype)
        # The model has a single sigmoid output, so threshold the probability at 0.5
        # instead of taking argmax over one column (which is always 0 and makes MCC collapse to 0).
        y_pred_2 = tf.constant(
            [[1] if a[0] > 0.5 else [0] for a in y_pred.numpy()]
        )

        new_conf_mtx = tf.math.confusion_matrix(
            # labels=tf.argmax(y_true, 1),
            # predictions=tf.argmax(y_pred, 1),
            labels=y_true_2,
            predictions=y_pred_2,
            num_classes=self.num_classes,
            weights=sample_weight,
            dtype=self.dtype,
        )

        self.conf_mtx.assign_add(new_conf_mtx)
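
An alternative that avoids editing the metric, just as a sketch assuming binary labels of shape (batch, 1) and a single sigmoid output, is to one-hot both sides before feeding the stock metric:

    # Sketch only: one-hot encode labels and thresholded predictions so the stock
    # argmax-based TFA metric sees two columns instead of one sigmoid column.
    mcc_metric = tfa.metrics.MatthewsCorrelationCoefficient(num_classes=2)
    y_true_oh = tf.one_hot(tf.cast(tf.squeeze(y_true, axis=-1), tf.int32), depth=2)
    y_pred_oh = tf.one_hot(tf.cast(tf.squeeze(y_pred, axis=-1) > 0.5, tf.int32), depth=2)
    mcc_metric.update_state(y_true_oh, y_pred_oh)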

Not sure if this is related but hopefully it helps someone.

seanpmorgan commented 1 year ago

TensorFlow Addons is transitioning to a minimal maintenance and release mode. New features will not be added to this repository. For more information, please see our public messaging on this decision: TensorFlow Addons Wind Down

Please consider sending feature requests / contributions to other repositories in the TF community with a similar charter to TFA: Keras, Keras-CV, Keras-NLP.