cvcode18 / imbalanced_learning


WIDER database preprocessing #1

Closed AliceLcz closed 5 years ago

AliceLcz commented 5 years ago

@cvcode18 Hi! Thanks for providing the code for attribute detection! I have a question about how to handle the attribute labels in WIDER, which take three values: 1, 0, and -1. How should the unspecified labels be handled during WIDER preprocessing? And what should I do with these multi-valued labels in loss computation and mAP evaluation? I'm new to this field; could you give me some advice? Thanks a lot!

cvcode18 commented 5 years ago

The WIDER dataset indeed provides three label values per attribute: 1 for positive, -1 for negative, and 0 for uncertain. Most works, including this one, treat the uncertain labels as negative at training time (to have more samples). At validation/testing time, they ignore the uncertain samples and compute the mAP from only the positives and the negatives.
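A minimal sketch of the training-time convention described above, assuming the labels are held in a NumPy array of shape (samples, attributes). The function name `to_training_targets` and the sample values are hypothetical and not taken from this repository.

```python
import numpy as np

def to_training_targets(labels):
    """Map WIDER labels {1, 0, -1} to binary training targets {1, 0}.

    Hypothetical helper: positive (1) stays 1, while uncertain (0)
    and negative (-1) both become 0.
    """
    labels = np.asarray(labels)
    return (labels == 1).astype(np.float32)

# Made-up example: 2 samples, 3 attributes.
raw = np.array([[1, 0, -1],
                [-1, 1, 0]])
targets = to_training_targets(raw)
print(targets)
```

Because uncertain and negative labels end up with the same target, the result can be fed to any binary per-attribute loss without further masking.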

Note that sklearn.metrics.average_precision_score takes binary labels as input. This means that for training, once you've mapped the labels that are 0 to -1, you just need to change all the -1s to 0, so that positive is 1 and negative is 0. For validation/testing, what you need to do for each attribute is ignore the indices of the label vector that have value 0 (the uncertain samples) and then, for the remaining ones, map the -1s to 0.
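The evaluation procedure above could be sketched as follows. This is an illustration under assumptions, not the repository's evaluation code: `masked_mean_ap` is a hypothetical name, labels and scores are assumed to be arrays of shape (samples, attributes), and skipping attributes that have no positives left is a choice made here for the sketch.

```python
import numpy as np
from sklearn.metrics import average_precision_score

def masked_mean_ap(labels, scores):
    """Mean AP over attributes, ignoring uncertain (0) labels.

    labels: values in {1, 0, -1}; scores: predicted confidences.
    """
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    aps = []
    for j in range(labels.shape[1]):
        keep = labels[:, j] != 0                     # drop uncertain samples
        y_true = (labels[keep, j] == 1).astype(int)  # -1 -> 0, 1 -> 1
        if y_true.sum() == 0:
            continue  # assumption: skip attributes with no positives left
        aps.append(average_precision_score(y_true, scores[keep, j]))
    return float(np.mean(aps)) if aps else float("nan")

# Made-up example: 4 samples, 2 attributes.
labels = np.array([[1, -1], [0, 1], [-1, 0], [1, -1]])
scores = np.array([[0.9, 0.2], [0.8, 0.7], [0.1, 0.9], [0.7, 0.3]])
mean_ap = masked_mean_ap(labels, scores)
print(mean_ap)
```

In this toy data the uncertain samples carry high scores, so counting them as negatives at test time would lower the AP; masking them out leaves only the clearly labelled samples in the ranking.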

Hope this helps. Welcome to the field!

AliceLcz commented 5 years ago

Oh, I'm grateful for your quick reply! It's very kind of you to explain the problems that puzzled me most.

So now I will map the training labels from -1 to 0 for the loss computation. And for mAP evaluation on training/validation/testing, I will first eliminate the 0 labels and then map the -1s to 0.

Thanks!