thoth000 opened this issue 1 month ago
I understand that in this implementation, per-image metrics are averaged via the add() method. Specifically, metrics such as accuracy and the F1 score are computed for each image individually and then averaged across all images.
Here is the relevant part of the code:
```python
import numpy as np

class AverageMeter(object):
    def add(self, val, weight):
        self.val = val
        # accumulate the weighted per-image metric values ...
        self.sum = np.add(self.sum, np.multiply(val, weight))
        self.count = self.count + weight
        # ... and keep a running average of them
        self.avg = self.sum / self.count
```
Is this implementation correct for calculating overall performance, or should the metrics be computed after accumulating TP, FP, TN, and FN across the entire dataset?
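As a hypothetical illustration of why the choice matters (numbers invented purely for the example): suppose image A has TP = 90 and FN = 10 (sensitivity 0.90), while image B has TP = 1 and FN = 9 (sensitivity 0.10). Averaging the per-image sensitivities gives (0.90 + 0.10) / 2 = 0.50, whereas aggregating the counts first gives TP = 91, FN = 19 and a sensitivity of 91 / 110 ≈ 0.83, so the two schemes can report quite different overall numbers.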
Hi,
I believe there might be an issue with how AverageMeter() is used to manage the evaluation metrics: https://github.com/lseventeen/FR-UNet/blob/master/utils/metrics.py
Issue:
From my understanding, the current implementation calculates metrics such as accuracy (Acc), sensitivity (Sen), and the F1 score for each image individually, and then averages these values across all images. Please correct me if my understanding is wrong.
Concern:
In practice, however, the correct approach would be to accumulate the true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) across the entire dataset and then calculate the final metrics (e.g., accuracy, sensitivity) based on these aggregated values.
Suggestion:
I recommend revisiting the implementation to ensure that the evaluation is done by aggregating TP, FP, TN, and FN over the entire dataset, as this provides a more accurate representation of the model's performance.
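For illustration only, here is a minimal sketch of how such dataset-level aggregation could be organized; the ConfusionMatrixMeter class and its method names are hypothetical and not part of the FR-UNet repository:

```python
import numpy as np

class ConfusionMatrixMeter(object):
    """Accumulates TP, FP, TN, FN over the whole dataset (hypothetical sketch)."""

    def __init__(self):
        self.tp = self.fp = self.tn = self.fn = 0

    def add(self, pred, target):
        # pred and target are binary numpy arrays for a single image
        pred, target = pred.astype(bool), target.astype(bool)
        self.tp += np.sum(pred & target)
        self.fp += np.sum(pred & ~target)
        self.tn += np.sum(~pred & ~target)
        self.fn += np.sum(~pred & target)

    def metrics(self):
        # compute the final metrics once, from the aggregated counts
        eps = 1e-8
        acc = (self.tp + self.tn) / (self.tp + self.tn + self.fp + self.fn + eps)
        sen = self.tp / (self.tp + self.fn + eps)
        pre = self.tp / (self.tp + self.fp + eps)
        f1 = 2 * pre * sen / (pre + sen + eps)
        return {"Acc": acc, "Sen": sen, "Pre": pre, "F1": f1}
```

With a meter like this, add() would be called once per image inside the evaluation loop, and the metrics would be computed a single time from the accumulated counts after the loop finishes.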
Thank you for your attention to this issue. If my understanding is incorrect, please feel free to correct me. Best regards,
[Yuma]