YanCote / IFT6268-simclr

Project for IFT6268

I added AUC score and added sigmoid to the logits #36

Closed sgaut023 closed 3 years ago

sgaut023 commented 3 years ago

1- AUC SCORE : #30 I added a new metric: a multi-label AUC score, computed separately for each label. I first tried tf.keras.metrics.AUC, but I was not able to get the expected AUC values with it.

I decided to use sklearn's roc_auc_score instead. However, I discovered an issue: roc_auc_score can raise an error (ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.). The error occurs when, for at least one label, every sample in the batch has the same value. For example, if all the samples in the batch are positive for Hernia, the error occurs. Concretely, this call raises the error: roc_auc_score([[1,0,1], [1,0,1]], [[1,0.1,1], [0.9,0,0.8]]). Let's say the first index is associated with Hernia and there are 2 instances in the batch. Since both instances have the label '1' for Hernia, sklearn cannot compute an AUC for that label and raises the error.

To work around this, I keep all the labels and logits for ONE epoch in a list. At the end of the epoch, the AUC is computed over everything collected, which makes it much less likely that a label has only one class.
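The accumulate-then-score idea can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code in the PR: EpochAUC and binary_auc are hypothetical names, and binary_auc is a rank-based stand-in for sklearn's roc_auc_score that returns NaN rather than raising when a label has only one class.

```python
from typing import List, Sequence


def binary_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Rank-based AUC for one label: the fraction of (positive, negative)
    pairs ranked correctly, with ties counted as half a win.
    Returns NaN when only one class is present (AUC is undefined)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


class EpochAUC:
    """Hypothetical helper: collects labels and scores for one epoch and
    computes one AUC per label at the end of the epoch."""

    def __init__(self, num_labels: int) -> None:
        self.num_labels = num_labels
        self.labels: List[Sequence[int]] = []
        self.scores: List[Sequence[float]] = []

    def update(self, batch_labels, batch_scores) -> None:
        # Store the rows of each batch; nothing is scored per batch.
        self.labels.extend(batch_labels)
        self.scores.extend(batch_scores)

    def result(self) -> List[float]:
        # Score each label column over everything seen this epoch.
        return [
            binary_auc([row[k] for row in self.labels],
                       [row[k] for row in self.scores])
            for k in range(self.num_labels)
        ]

    def reset(self) -> None:
        # Call at the start of every epoch.
        self.labels, self.scores = [], []
```

With only the batch from the example above, every label would still be undefined; once a second batch with the other class arrives, each label gets a score. Swapping binary_auc for roc_auc_score should give the same numbers whenever both classes are present.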

2- MLFLOW : #8 I added code at the end of training to log the metrics and params in mlflow. For now, I only log the training metrics. The next step is to compute the metrics on the validation set and log them in mlflow.

3- Sigmoid Logits : @YanCote I applied sigmoid to the logits to make sure the outputs are between 0 and 1. Can you confirm that this is the behaviour we expect? When we make the predictions, we use a threshold (0.5), so I suppose the values we compare against it need to be between 0 and 1.
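For illustration, here is a small sketch of the sigmoid-then-threshold step in plain Python. The function names are hypothetical and the PR itself presumably uses the TensorFlow equivalent; the point is only that the 0.5 threshold is meaningful once the logits have been mapped to (0, 1).

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable sigmoid: never exponentiates a large positive number."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict(logits: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Map raw logits to probabilities, then apply the decision threshold."""
    return [1 if sigmoid(x) >= threshold else 0 for x in logits]
```

Since sigmoid(0) = 0.5, thresholding the probabilities at 0.5 gives the same predictions as thresholding the raw logits at 0; applying the sigmoid mainly matters for metrics and thresholds that are expressed as probabilities.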

4- Corrected Loss : #29 After my discussion with Yan, we realized that there was a bug in the loss function. We need to use neg_abs_logits instead of -x. The change is done in the PR.
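The loss code itself is not shown here, so the following is only my reading of the fix: it assumes the loss is the usual numerically stable sigmoid cross-entropy, max(x, 0) - x*z + log(1 + exp(-|x|)), where x is the logit and z the label. Under that assumption, using -x in place of -|x| inside the exponential gives a wrong value for negative logits. The function name is hypothetical.

```python
import math


def sigmoid_cross_entropy(logit: float, label: float) -> float:
    """Stable sigmoid cross-entropy for one logit x and label z:
    max(x, 0) - x * z + log(1 + exp(-|x|))."""
    # The exponent must be -|x|, not -x: with -x the formula is wrong for
    # x < 0 and exp() can overflow for large negative logits.
    neg_abs_logit = -abs(logit)
    return (max(logit, 0.0)
            - logit * label
            + math.log1p(math.exp(neg_abs_logit)))
```

As a quick check, for x = -2 and z = 1 the expected loss is log(1 + exp(2)), which is what the -|x| form returns, whereas the -x form adds the log term on top of the 2 already contributed by -x*z.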