Object retrieval dataset

hayasick commented 1 year ago

Hi, thank you for creating a great dataset!

I'd like to evaluate my model on the object retrieval task described in your paper (Section 4.3), but I can't find the dataset used in the experiment. Do you have any plans to release it?

mguillau commented 1 year ago

Hi,

Thanks for your interest in the retrieval task. You can already find the dataset splits here: http://amazon-berkeley-objects.s3.amazonaws.com/benchmarks/abo-mvr.csv.xz

After decompression, you'll get a CSV file with the following columns:

split_set: one of "train", "val-target", "val-query", "test-target", "test-query" or "test-extra-gallery". For training, use "train" alone. For validation, use "train" and "val-target" as the index and "val-query" as the queries. For testing, use "train", "val-target", "val-query", "test-target" and "test-extra-gallery" as the index and "test-query" as the queries.
class_id: the class label for each sample, used for computing recall and other metrics
group_id: a higher-level grouping of related products, used to ensure that related products stay in the same split
image_id: a unique image identifier
image_type: either "image" (a real product image) or "render" (a Blender render of the 3D model)
path: the path to the image, relative to a prefix that depends on image_type: use abo_material/ if image_type is "render", and images/small/ otherwise
meta: metadata whose content also depends on image_type. For "render", it has the form "azimuth:X+elevation:Y", where X and Y are the bucket numbers for azimuth and elevation (as used in Fig. 8 of the paper). For "image", the metadata is in the form "item_id:main|other".
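As a rough illustration of the split protocol and path prefixes described above, here is a minimal Python sketch. It assumes the decompressed CSV has a header row with exactly the column names listed; the helper names (load_rows, select, full_path) are hypothetical and not part of any released code.

```python
import csv
import lzma

# Index/query composition per evaluation mode, as described in the column list.
SPLITS = {
    "val": {
        "index": {"train", "val-target"},
        "query": {"val-query"},
    },
    "test": {
        "index": {"train", "val-target", "val-query",
                  "test-target", "test-extra-gallery"},
        "query": {"test-query"},
    },
}

# The path prefix depends on image_type.
PREFIX = {"render": "abo_material/", "image": "images/small/"}


def load_rows(csv_xz_path):
    """Read the xz-compressed CSV into a list of dicts (assumes a header row)."""
    with lzma.open(csv_xz_path, mode="rt", encoding="utf-8", newline="") as f:
        return list(csv.DictReader(f))


def select(rows, mode, role):
    """Return the rows forming the 'index' or 'query' set for 'val' or 'test'."""
    wanted = SPLITS[mode][role]
    return [r for r in rows if r["split_set"] in wanted]


def full_path(row):
    """Prepend the image_type-dependent prefix to the relative path."""
    return PREFIX[row["image_type"]] + row["path"]
```

For example, select(rows, "test", "index") would return every row except the "test-query" ones, and the training set is simply the rows whose split_set is "train".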

Let us know if you have more questions.

hayasick commented 1 year ago

Thank you very much! I was able to download the file.

I'd like to know more about the meta column.

For "image", the metadata is in the form "item_id:main|other".

Does item_id correspond to the item id of a rendered image with the same class? And what does main|other mean? I've checked the paper, but I can't find a description of this.

mguillau commented 1 year ago

Class ids are essentially groups of item_id values. An item_id corresponds to a single product, but some products are very similar or even visually identical (e.g. internal color, invisible specs, etc.), so we used a clustering technique to group them (this grouping is not perfect, and some errors might occur). And yes, the item_id for images and for renders should correspond.
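Since class_id rather than item_id is the label used for metrics, a retrieved image counts as correct when it shares the query's class_id. The sketch below shows one common way to compute recall@k under that assumption; the exact metric definition used in the paper may differ, and recall_at_k is a hypothetical helper name.

```python
def recall_at_k(ranked_class_ids, query_class_ids, k):
    """Fraction of queries with at least one same-class item in their top k.

    ranked_class_ids: one list per query, holding the class_id of each
        retrieved index item, best match first.
    query_class_ids: the class_id of each query, in the same order.
    """
    hits = 0
    for ranked, query_class in zip(ranked_class_ids, query_class_ids):
        if query_class in ranked[:k]:
            hits += 1
    return hits / len(query_class_ids)
```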

About main vs other: main is simply the first, primary image displayed for that product on the web page. All remaining images are marked other, in no particular order.
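To make the two meta formats concrete, here is a small parsing sketch. It assumes that "item_id:main|other" means the literal item id followed by ":main" or ":other", and it keeps the azimuth and elevation buckets as strings because their exact numeric type is not stated in this thread. The function name parse_meta is hypothetical.

```python
def parse_meta(image_type, meta):
    """Split the meta column into named fields, depending on image_type."""
    if image_type == "render":
        # Expected form: "azimuth:X+elevation:Y"
        fields = dict(part.split(":", 1) for part in meta.split("+"))
        return {"azimuth": fields["azimuth"], "elevation": fields["elevation"]}
    # Expected form for real images: "<item_id>:main" or "<item_id>:other"
    item_id, role = meta.rsplit(":", 1)
    return {"item_id": item_id, "role": role}
```

With this, the main product photo of each item could be picked out by keeping only the "image" rows whose parsed role is "main".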

hayasick commented 1 year ago

I see, I completely understand. Thanks!
