jazcollins / amazon-berkeley-objects


Object retrieval dataset #4

Closed hayasick closed 1 year ago

hayasick commented 1 year ago

Hi, thank you for creating a great dataset!

I'm curious to evaluate my model on the object retrieval task described in your paper (section 4.3), but I can't find the dataset used in the experiment. Do you have any plans to release the dataset?

mguillau commented 1 year ago

Hi,

Thanks for your interest in the retrieval task. You can already find the dataset splits here: http://amazon-berkeley-objects.s3.amazonaws.com/benchmarks/abo-mvr.csv.xz

After decompression, you'll get a CSV file with the following columns:

Let us know if you have more questions.
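For reference, here is a minimal sketch of reading the compressed file directly in Python, without decompressing it to disk first. The helper name `peek_xz_csv` is made up for illustration, and the sketch assumes only that the CSV has a header row; it makes no assumption about the column names and simply reports whatever header is there.

```python
import csv
import lzma
from itertools import islice


def peek_xz_csv(path, limit=5):
    """Return (column_names, first `limit` rows) of an .xz-compressed CSV.

    Assumes the first line of the CSV is a header row.
    """
    # lzma.open in text mode decompresses on the fly while reading.
    with lzma.open(path, "rt", encoding="utf-8", newline="") as f:
        reader = csv.DictReader(f)
        rows = list(islice(reader, limit))
        return reader.fieldnames, rows


# Example usage (downloads the file first; not executed here):
# import urllib.request
# url = "http://amazon-berkeley-objects.s3.amazonaws.com/benchmarks/abo-mvr.csv.xz"
# urllib.request.urlretrieve(url, "abo-mvr.csv.xz")
# columns, rows = peek_xz_csv("abo-mvr.csv.xz")
# print(columns)
# for row in rows:
#     print(row)
```

Printing the header this way is a quick check of which columns the file actually contains before writing any evaluation code against it.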

hayasick commented 1 year ago

Thank you very much! I was able to download the file.

I'd like to know more about the meta column.

For "image", the metadata is in the form "item_id:main|other".

Does item_id correspond to the item id of a rendered image with the same class? And what does main|other mean? I've checked the paper, but I can't find a description of this.

mguillau commented 1 year ago

Class ids are essentially groups of item_id values. An item_id corresponds to a single product, but sometimes products are very similar or even visually identical (e.g. internal color, invisible specs, etc.), so we used a clustering technique to group them (this grouping is not perfect and some errors might occur). But yes, item_id values for images and renders should correspond.

About main vs other: main is simply the first (primary) image displayed for that product on the web page; the rest of the images are marked other, in no particular order.
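To make the above concrete, here is a rough sketch of parsing the image meta value and grouping item ids by class id. It assumes the meta string is literally `<item_id>:main` or `<item_id>:other`, as described earlier in this thread. The names `ImageMeta`, `parse_image_meta`, and `group_items_by_class` are hypothetical, and the item id in the example is made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class ImageMeta(NamedTuple):
    item_id: str
    role: str  # "main" (first image on the product page) or "other"


def parse_image_meta(meta: str) -> ImageMeta:
    """Split an image meta value of the assumed form '<item_id>:main|other'."""
    # Split on the last ':' so the role is always the final component.
    item_id, sep, role = meta.rpartition(":")
    if not sep or not item_id or role not in ("main", "other"):
        raise ValueError(f"unexpected image meta value: {meta!r}")
    return ImageMeta(item_id, role)


def group_items_by_class(
    pairs: Iterable[Tuple[str, str]],
) -> Dict[str, Set[str]]:
    """Collect the set of item ids seen for each class id.

    `pairs` is an iterable of (class_id, item_id) tuples.
    """
    groups: Dict[str, Set[str]] = defaultdict(set)
    for class_id, item_id in pairs:
        groups[class_id].add(item_id)
    return dict(groups)


# Example with a made-up item id:
# parse_image_meta("B000000000:main")
```

A class that maps to more than one item id is one of the clusters of near-identical products mentioned above.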

hayasick commented 1 year ago

I see, I completely understand. Thanks!