stnoah1 / mcb

A large-scale mechanical components benchmark for classification and retrieval tasks, named Mechanical Components Benchmark (MCB)
https://mechanical-components.herokuapp.com/
MIT License

Evaluation metrics on MCB. #7

Open GostInShell opened 3 years ago

GostInShell commented 3 years ago

Hi! Thank you for contributing this great dataset! I have some questions about the evaluation metrics mentioned in the paper.

For the N in P@N, R@N, F1@N, and NDCG@N, the paper says that N is the total length of the retrieved list. Is N defined as the number of objects in the class that the query object belongs to?

stnoah1 commented 3 years ago

Hello, thank you for your interest in my work.

N in the retrieval benchmark refers to the smaller of the total number of objects in the query's category and the maximum allowed retrieved-list length (100 in our experiment).
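
For illustration, here is a minimal sketch of how that definition of N plays out for a single query (hypothetical code, not our evaluation script; `retrieved_labels`, `query_label`, and `class_size` are assumed inputs, and recall is assumed to be computed against the category size):

```python
# Hypothetical sketch of the N definition described above (not the official MCB evaluation code).
MAX_RETRIEVED = 100  # maximum allowed retrieved-list length used in our experiments

def precision_recall_at_n(retrieved_labels, query_label, class_size):
    """P@N and R@N for a single query, using the standard definitions:
    precision = relevant retrieved / N, recall = relevant retrieved / category size."""
    n = min(class_size, MAX_RETRIEVED)        # N = min(category size, max list length)
    top_n = retrieved_labels[:n]
    relevant_retrieved = sum(1 for lbl in top_n if lbl == query_label)
    precision = relevant_retrieved / n
    recall = relevant_retrieved / class_size  # equals precision when class_size <= MAX_RETRIEVED
    return precision, recall
```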

Best, Hyung-gun Chi

GostInShell commented 3 years ago

@stnoah1 Thank you for your reply!

Could you also kindly explain why Precision@N and Recall@N differ between the micro and macro settings?

I hope I have this right: micro directly calculates the mean Precision@N and Recall@N over all query objects. Macro first calculates the mean within each class and then takes another mean over all classes as the result.
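
Here is a minimal sketch of the two averaging schemes as I understand them (hypothetical code, not the benchmark's implementation; `per_query_scores` and `query_classes` are assumed inputs):

```python
# Hypothetical sketch of micro vs. macro averaging as I understand it
# (not the official MCB evaluation code).
from collections import defaultdict

def micro_average(per_query_scores):
    """Micro: mean of the per-query scores over all query objects."""
    return sum(per_query_scores) / len(per_query_scores)

def macro_average(per_query_scores, query_classes):
    """Macro: mean within each class first, then mean of the class means."""
    by_class = defaultdict(list)
    for score, cls in zip(per_query_scores, query_classes):
        by_class[cls].append(score)
    class_means = [sum(v) / len(v) for v in by_class.values()]
    return sum(class_means) / len(class_means)

# Example: classes of unequal size give different micro and macro means.
scores  = [1.0, 1.0, 1.0, 0.0]            # per-query P@N (or R@N)
classes = ["bolt", "bolt", "bolt", "gear"]
print(micro_average(scores))              # 0.75
print(macro_average(scores, classes))     # (1.0 + 0.0) / 2 = 0.5
```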

I think the reason that Precision@N and Recall@N are equal for the dataset is that Precision@N and Recall@N are equal for each query object. But then shouldn't Precision@N and Recall@N be equal regardless of micro vs. macro averaging?
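
As a concrete instance of the per-query equality I mean (assuming recall is computed against the category size): a query whose category has 60 objects gives N = min(60, 100) = 60, so P@N = hits/60 and R@N = hits/60 are the same number for that query.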

These points have confused me for a while. Thank you in advance!