akturtle closed this issue 6 years ago
Sorry for this bug; I have fixed it. However, it does not have an obvious impact on our results, since it is unlikely for our method to find 0 relevant codes in the database similar to a query.
Hi, I have one more thing to discuss with you. In your code, the relevant_num used for calculating average precision is the number of similar images found in the query result. But from my understanding, relevant_num should be the number of similar images that ought to be in the query result (i.e., retrieving all similar images). Taking the ImageNet experiment as an example, relevant_num should be fixed at 1000.
In our code, relevant_num is the number of similar images among the closest 1000 images to the query image. MAP@1000 means that we retrieve the closest 1000 images and calculate the MAP over them.
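For concreteness, here is a minimal Python sketch of AP under this definition (my own illustration, not the repository's actual code; the function name and arguments are hypothetical):

```python
import numpy as np

def average_precision_at_k(ranked_relevance, k=1000):
    """AP@k, where relevant_num counts the relevant items found
    within the top-k retrievals (the definition described above).

    ranked_relevance: 0/1 sequence, 1 meaning the retrieved item is
    similar to the query, ordered by decreasing similarity.
    """
    rel = np.asarray(ranked_relevance[:k], dtype=float)
    relevant_num = rel.sum()            # relevant items within the top k
    if relevant_num == 0:
        return 0.0                      # zero-relevant case, discussed below
    cum_hits = np.cumsum(rel)           # hits accumulated down the ranked list
    ranks = np.arange(1, len(rel) + 1)
    precisions = cum_hits / ranks       # precision at every rank
    # average the precision values taken at the relevant ranks only
    return float((precisions * rel).sum() / relevant_num)
```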
Think about this: if only the top 2 retrieved images are similar, then relevant_num = 2 and the average precision equals 100%. But there are more than 1000 similar images in your database, so the AP should be calculated as (1/1 + 2/2)/1000 = 0.2%. Check this discussion.
AP is a ranking metric. If the top 2 retrievals in the ranked list are relevant (and only the top 2), AP is 100%. You're talking about Recall, which in this case is indeed 0.2%.
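To make the distinction concrete, here is the arithmetic for that example as a small Python sketch (my own illustration; the numbers are just the hypothetical case above):

```python
# Top 2 retrievals are relevant, the remaining 998 are not;
# the database contains 1000 relevant items in total.
ranked = [1, 1] + [0] * 998

# AP normalized by the relevant items actually retrieved (2 of them):
hits, precisions = 0, []
for rank, r in enumerate(ranked, start=1):
    if r:
        hits += 1
        precisions.append(hits / rank)
ap = sum(precisions) / hits        # (1/1 + 2/2) / 2 = 1.0  -> 100%

# Recall against all 1000 relevant items in the database:
recall = sum(ranked) / 1000        # 2 / 1000 = 0.002       -> 0.2%
```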
@kunhe Actually, I'm not quite sure about this. Have you checked the discussion above? I think when we calculate MAP, the recall should be guaranteed to equal 100%.
@akturtle If you want 100% recall, you should go down the ranked list until all relevant items are found, say at position K. Sure, you can compute MAP on the sublist from 1 to K, knowing that recall is 100%. But here MAP is computed by fixing K=1000, and 100% recall may or may not happen at that point.
Hi, it seems your MAP calculation is not precise. I checked your function "mean_average_precision": when relevant_num == 0, the query is ignored when calculating the mean. That eliminates some extremely bad cases (where no similar images are retrieved), so the reported MAP will be a little higher.
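To illustrate the effect, here is a sketch of the two ways of handling relevant_num == 0 queries (hypothetical numbers, not the repository's actual code):

```python
# Per-query AP values; None marks a query with relevant_num == 0.
aps = [0.8, 0.6, None, 0.9]

# Skipping zero-relevant queries (the behavior described above):
valid = [a for a in aps if a is not None]
map_skip = sum(valid) / len(valid)        # 2.3 / 3 ~= 0.767

# Counting them as AP = 0 (penalizing the extreme bad cases):
map_zero = sum(a if a is not None else 0.0 for a in aps) / len(aps)  # 2.3 / 4 = 0.575
```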