metadata.json files are missing

Franck-Dernoncourt commented 4 years ago

https://github.com/facebookresearch/simmc/tree/master/data#overview-of-the-dataset-repository (mirror) mentions:

JSON format: ./{domain}/metadata.json

I don't see any metadata.json file.

shanemoon commented 4 years ago

Hi Franck, thank you for your interest in participating, and sorry for the delayed response. The files have been added at ./data/simmc_fashion/fashion_metadata.json and ./data/simmc_furniture/furniture_metadata.csv.

Franck-Dernoncourt commented 4 years ago

Thanks! I'm guessing that the "278" on line 2 of https://github.com/facebookresearch/simmc/blob/master/data/simmc_fashion/fashion_metadata.json corresponds to the image_id. However, https://github.com/facebookresearch/simmc/tree/master/data#data-format says it's the object_id. So, to rephrase the question: does object_id == image_id?

shubhamagarwal92 commented 4 years ago

Thanks @Franck-Dernoncourt @shanemoon for the discussion.

1. Could you provide the prefix for the image url in the `fashion_metadata.json`? It seems like a hash (e.g. `url:"GJkR2gDcSJhuxW4BAAAAAAATI4IfbtAUAAAB"`). As @Franck-Dernoncourt mentioned, how can we align the images with the corresponding dialogs? (`visual_objects` in the dialog json doesn't have any identifier of the corresponding image.)

2. Similarly, what is the identifier in the furniture domain for the glb object?

seo-95 commented 4 years ago

I think that the alignment between each dialogue's visual_objects and the actual objects in fashion_metadata.json can be found in the dialogue_coref_map associated with each dialogue. Here is an example extracted from the first dialogue in fashion_train_dials.json:

"belief_state":  [{'act': 'DA:ASK:CHECK:CLOTHING.pattern', 'slots': [['fashion-O', 'OBJECT_0'], ['fashion-attentionOn', 'this']]}]
"visual_objects": {'OBJECT_0': {'color': ['white'], 'embellishments': ['pointelle'], 'pos': 'focus', 'sleeveStyle': ['long_sleeve'], 'sweaterStyle': ['duster', 'kimono', 'loose', 'crochet'], 'type': 'sweater'}}

"dialogue_coref_map": {'1426': 0, '1429': 1}

The dialogue_coref_map in this example should be interpreted as follows: OBJECT_0 corresponds to object 1426 in fashion_metadata.json, and OBJECT_1 to object 1429. At least, this is my understanding.
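
To make that reading concrete, here is a minimal sketch that inverts a dialogue_coref_map into a lookup from the local OBJECT_<n> names to metadata object ids. It assumes the map really goes from metadata object id to local index, as interpreted above; `invert_coref_map` is a hypothetical helper name, not something from the SIMMC codebase.

```python
def invert_coref_map(dialogue_coref_map):
    """Return {"OBJECT_<index>": metadata_object_id} for one dialogue.

    Assumes dialogue_coref_map maps metadata object id (str) -> local index (int).
    """
    return {
        f"OBJECT_{index}": object_id
        for object_id, index in dialogue_coref_map.items()
    }


# The map quoted above, written as a regular Python dict.
coref_map = {"1426": 0, "1429": 1}
local_to_metadata = invert_coref_map(coref_map)
print(local_to_metadata)
```

With this lookup, a slot value such as 'OBJECT_0' in belief_state can be resolved to the id used as a key in the metadata file, if the assumption above holds.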

shubhamagarwal92 commented 4 years ago

@seo-95 Thanks for the detailed response.

@shanemoon Could you please confirm:

1. For the fashion domain, should we expect any visual content?
2. For furniture, is the obj name the identifier? For example:

fashion_dev_dials.json
"dialogue_coref_map": {'763066': 0, '763118': 1}

furniture_metadata.csv
obj: "http://img.wfrcdn.com/docresources/36069/76/763066.zip"

skiingpacman commented 4 years ago

Hi, and thanks for your interest in the SIMMC track.

On (1), for fashion, I can confirm that we are currently not planning to make the original images public. On (2), yes: the furniture 'prefab' id in the dialogue_coref_map (and elsewhere) refers to the id in the obj url, i.e. the digits just before the '.zip', e.g.

obj: "http://img.wfrcdn.com/docresources/36069/76/*763066*.zip"
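
A minimal sketch of extracting that id from an obj url, assuming every url ends in `<id>.zip` like the one quoted in this thread; `prefab_id_from_obj_url` is a hypothetical helper name, not part of the SIMMC code.

```python
import posixpath
from urllib.parse import urlparse


def prefab_id_from_obj_url(obj_url):
    """Return the file name stem of the url path, e.g. '763066' for '.../763066.zip'."""
    filename = posixpath.basename(urlparse(obj_url).path)
    stem, extension = posixpath.splitext(filename)
    if extension != ".zip":
        # Assumption: obj urls always point at a .zip archive.
        raise ValueError(f"unexpected obj url: {obj_url!r}")
    return stem


obj_url = "http://img.wfrcdn.com/docresources/36069/76/763066.zip"
prefab_id = prefab_id_from_obj_url(obj_url)
print(prefab_id)
```

The resulting string can then be compared against the keys of a dialogue_coref_map, which are also strings in the examples quoted above.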
