Evaluation based on each image vs entire dataset

hendrycks / anomaly-seg

The Combined Anomalous Object Segmentation (CAOS) Benchmark

MIT License


Evaluation based on each image vs entire dataset #21

tianyu0207 closed this issue 3 years ago

tianyu0207 commented 3 years ago

Hi,

Congrats on the great paper.

I looked at your evaluation method and found that averaging the per-image FPRs gives better results than computing the FPR once over the flattened scores and labels of the entire dataset. Why do you compute the FPR image-wise instead of over the entire dataset?

Thanks.

hendrycks commented 3 years ago

I do not think we had a particular reason to aggregate across images. Both strategies seem valid given a moment's thought. There may have been memory difficulties in aggregating across the whole dataset.
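For readers comparing the two strategies, here is a minimal sketch of how they can diverge. It is not the repository's evaluation code. It assumes the metric is FPR at 95% TPR, that higher scores mean "more anomalous", and that labels are 1 for anomalous pixels and 0 otherwise. The function names `fpr_at_tpr`, `per_image_fpr`, and `pooled_fpr` are hypothetical.

```python
import numpy as np

def fpr_at_tpr(scores, labels, tpr_level=0.95):
    """FPR at the loosest threshold that still detects >= tpr_level of positives.

    Assumption: higher score = more anomalous; label 1 = anomaly, 0 = normal.
    """
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels).ravel().astype(bool)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        return float("nan")  # metric undefined without both classes
    # The k-th highest positive score keeps at least tpr_level of positives.
    k = int(np.ceil(tpr_level * pos.size))
    thresh = np.sort(pos)[::-1][k - 1]
    return float(np.mean(neg >= thresh))

def per_image_fpr(score_maps, label_maps, tpr_level=0.95):
    """Image-wise strategy: one threshold per image, then average the FPRs."""
    fprs = [fpr_at_tpr(s, l, tpr_level) for s, l in zip(score_maps, label_maps)]
    return float(np.nanmean(fprs))

def pooled_fpr(score_maps, label_maps, tpr_level=0.95):
    """Dataset-wide strategy: flatten everything, use one global threshold."""
    scores = np.concatenate([np.ravel(s) for s in score_maps])
    labels = np.concatenate([np.ravel(l) for l in label_maps])
    return fpr_at_tpr(scores, labels, tpr_level)

# Toy data: image B's scores sit on a lower scale than image A's.
score_maps = [np.array([0.9, 0.8, 0.1, 0.2]), np.array([0.15, 0.05, 0.02])]
label_maps = [np.array([1, 1, 0, 0]), np.array([1, 0, 0])]

image_wise = per_image_fpr(score_maps, label_maps)
dataset_wide = pooled_fpr(score_maps, label_maps)
```

In this toy case each image is perfectly separable on its own, so the image-wise mean is 0, while the single global threshold has to drop low enough to catch image B's anomaly and admits a false positive from image A. That is one plausible reason the image-wise numbers look better. The pooled variant also has to hold every pixel of the dataset in memory at once, which fits the memory concern mentioned above.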