I'm new to WSOL. As I understand it, the WSOL task aims to localize and classify a single instance per image.
However, in the ImageNet val annotations (the .xml files downloaded from the official ImageNet website), an image may contain multiple instances.
Your ImageNet val list, by contrast, has only a single instance per image.
So I'd like to know how you processed the annotations to get your val list. Is this processing method widely used?
I compared labels/ILSVRC/val.txt with the raw xml files, and it seems that multiple boxes are handled by selecting the box of the first instance in each xml file.
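To make that guess concrete, here is a minimal sketch of a "keep only the first box" rule. It assumes the val annotations follow the Pascal VOC-style layout (`<object>` elements holding a `<name>` and a `<bndbox>`). The function name `first_box` and the sample annotation are made up for illustration; this is not the repo's actual preprocessing script, and it does not try to reproduce the format of val.txt.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation with two instances, written in the
# Pascal VOC-style layout assumed for the ImageNet val xml files.
SAMPLE_XML = """
<annotation>
  <filename>ILSVRC2012_val_00000001</filename>
  <object>
    <name>n01751748</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
  <object>
    <name>n01751748</name>
    <bndbox><xmin>150</xmin><ymin>30</ymin><xmax>300</xmax><ymax>200</ymax></bndbox>
  </object>
</annotation>
"""


def first_box(xml_text):
    """Return (class_id, (xmin, ymin, xmax, ymax)) for the first <object>, or None."""
    root = ET.fromstring(xml_text.strip())
    obj = root.find("object")  # find() returns the first match in document order
    if obj is None:
        return None
    bndbox = obj.find("bndbox")
    coords = tuple(
        int(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
    )
    return obj.findtext("name"), coords


def count_boxes(xml_text):
    """Count how many instances the annotation actually contains."""
    return len(ET.fromstring(xml_text.strip()).findall("object"))


if __name__ == "__main__":
    print(count_boxes(SAMPLE_XML), first_box(SAMPLE_XML))
```

Running `first_box` over every xml file and comparing the result with the matching line of labels/ILSVRC/val.txt would be one way to check whether the first-instance rule really holds for the whole list.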
Hi, thanks for your excellent code.
Looking forward to your reply.