paris-saclay-cds / ramp-workflow

Toolkit for building predictive workflows on top of pydata (pandas, scikit-learn, pytorch, keras, etc.).
https://paris-saclay-cds.github.io/ramp-docs/
BSD 3-Clause "New" or "Revised" License

Redundancy in soft accuracy score #229

Closed: lucyleeow closed this issue 4 years ago

lucyleeow commented 4 years ago

In `SoftAccuracy`, what is the purpose of the line `y_proba = np.clip(y_proba, 0, 1)` in:

    def __call__(self, y_true_proba, y_proba):
        # Clip negative probas
        y_proba_positive = np.clip(y_proba, 0, 1)
        # Normalize rows
        y_proba = np.clip(y_proba, 0, 1)
        y_proba_normalized = y_proba_positive / np.sum(
            y_proba_positive, axis=1, keepdims=True)
        # Smooth true probabilities with score_matrix
        y_true_smoothed = y_true_proba.dot(self.score_matrix)
        # Compute dot product between the predicted probabilities and
        # the smoothed true "probabilities" ("" because it does not sum to 1)
        scores = np.sum(y_proba_normalized * y_true_smoothed, axis=1)
        scores = np.nan_to_num(scores)
        score = np.mean(scores)
        # to pick up all zero probabilities
        score = np.nan_to_num(score)
        return score

It appears to compute the same thing as `y_proba_positive`, and `y_proba` is not used again later in the function?
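
For illustration, a quick check with a hypothetical `y_proba` shows that both clip calls produce identical arrays, so the second assignment adds nothing:

    import numpy as np

    # Hypothetical predicted probabilities, including a negative entry
    y_proba = np.array([[0.7, 0.4, -0.1],
                        [0.2, 0.5, 0.3]])

    y_proba_positive = np.clip(y_proba, 0, 1)
    y_proba_clipped_again = np.clip(y_proba, 0, 1)

    # Same input, same bounds: the two results are element-wise equal
    assert np.array_equal(y_proba_positive, y_proba_clipped_again)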

@agramfort @kegl

agramfort commented 4 years ago

Indeed, the line

    y_proba = np.clip(y_proba, 0, 1)

is useless.
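
For reference, a minimal sketch of `__call__` with that line removed; behavior is otherwise unchanged from the snippet quoted above:

    def __call__(self, y_true_proba, y_proba):
        # Clip negative probas
        y_proba_positive = np.clip(y_proba, 0, 1)
        # Normalize rows so each row sums to 1
        y_proba_normalized = y_proba_positive / np.sum(
            y_proba_positive, axis=1, keepdims=True)
        # Smooth true probabilities with score_matrix
        y_true_smoothed = y_true_proba.dot(self.score_matrix)
        # Compute dot product between the predicted probabilities and
        # the smoothed true "probabilities" ("" because it does not sum to 1)
        scores = np.sum(y_proba_normalized * y_true_smoothed, axis=1)
        scores = np.nan_to_num(scores)
        score = np.mean(scores)
        # nan_to_num again to pick up all-zero probability rows
        score = np.nan_to_num(score)
        return score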