CurryYuan / PhraseRefer

Toward Explainable and Fine-Grained 3D Grounding through Referring Textual Phrases
MIT License

Problem with the dataset. #2

Open YYLiuDLUT opened 9 months ago

YYLiuDLUT commented 9 months ago

Your work is excellent and has been of great help to my research, but I encountered a problem while handling the dataset. By merging on ann_id, I obtained only 29,538 sentences in the training set and over 7,000 in the test set, whereas the paper mentions 36,665 sentences. Is there a problem with my approach, or could you provide a dataloader?
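For reference, a minimal sketch of how I count sentences (the CSV path is my assumption based on the repo layout, and I treat each (scan_id, target_id, ann_id) triple as one sentence):

import pandas as pd

# Assumed path; each unique (scan_id, target_id, ann_id) triple counts as one sentence.
df = pd.read_csv('referit3d/data/scanrefer.csv')
print(df.groupby(['scan_id', 'target_id', 'ann_id']).ngroups)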

CurryYuan commented 9 months ago

Hi @YYLiuDLUT, please refer to the following script to organize the annotations by sentence:

import json

import pandas as pd
from tqdm import tqdm

# `dataset` is the name of the ReferIt3D-style CSV to load
# (left as a placeholder, as in the original snippet).
referit_csv = f'referit3d/data/{dataset}.csv'
referit_data = pd.read_csv(referit_csv)

with open('referit3d/data/scanrefer_entities_row.json', 'r') as f:
    scanrefer_entities = json.load(f)

entities = {}

for _, row in tqdm(referit_data.iterrows(), total=len(referit_data)):
    caption = row['utterance']
    scene_id = row['scan_id']
    target_id = str(row['target_id'])
    ann_id = str(row['ann_id'])

    # Collect every phrase annotation belonging to this sentence.
    phrases = []
    for phrase_anno in scanrefer_entities:
        if (phrase_anno['scene_id'] == scene_id
                and phrase_anno['object_id'] == target_id
                and phrase_anno['ann_id'] == ann_id):
            phrases.append({
                'labeled_id': phrase_anno['labeled_id'],
                'position_start': int(phrase_anno['position_start']),
                'position_end': int(phrase_anno['position_end']),
            })

    # Group the phrases under one key per sentence (the key format here
    # is illustrative; pick whatever suits your dataloader).
    entities[f'{scene_id}_{target_id}_{ann_id}'] = {
        'caption': caption,
        'phrases': phrases,
    }
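If the linear scan over scanrefer_entities is slow, one option (not part of the original script, just a sketch assuming the same JSON schema) is to index the annotations once so each sentence lookup is O(1):

from collections import defaultdict

# Build a lookup from (scene_id, object_id, ann_id) to its phrase annotations.
phrase_index = defaultdict(list)
for phrase_anno in scanrefer_entities:
    key = (phrase_anno['scene_id'], phrase_anno['object_id'], phrase_anno['ann_id'])
    phrase_index[key].append(phrase_anno)

# Then, inside the loop over referit_data, replace the inner scan with:
# phrases = phrase_index.get((scene_id, target_id, ann_id), [])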
YYLiuDLUT commented 9 months ago

Thank you for your reply. I forgot to mention that I encountered this issue while dealing with the scanrefer++ files; perhaps I should handle the scanrefer data differently. I am also merging the data by checking phrase_anno['scene_id'] == scene_id and phrase_anno['object_id'] == target_id and phrase_anno['ann_id'] == ann_id, but the number of sentences I obtain differs significantly from what is stated in the paper.

CurryYuan commented 9 months ago

Hi @YYLiuDLUT, thank you for bringing this to our attention. It appears that some annotations have been filtered out in the current file. We are looking into this to identify and resolve the issue as soon as possible. Meanwhile, we recommend continuing with the available annotations. We appreciate your patience and understanding in this matter.

YYLiuDLUT commented 9 months ago

Thank you for your prompt reply. Looking forward to your future updates.

YYLiuDLUT commented 9 months ago

I rechecked the dataset, and it seems that all the samples whose targets are unique in the scanrefer dataset are missing. This makes it difficult for me to compare fairly with other methods. Is it possible for you to release this portion of the data as soon as possible?

CurryYuan commented 9 months ago

What do you mean by a fair comparison? A unique object does not need an anchor object as a reference, so its phrase can simply be treated as the object name of the target object.

YYLiuDLUT commented 9 months ago

Even though the target in the sentence is unique, other objects are present in the sentence as well. Is it correct that the loss function in your paper requires simultaneous comparison of both target and non-target objects? I'm not sure if my understanding is accurate. Here is a sample unique utterance: "the highlighted door is white in color and is closed . when you enter two trash cans can be seen on your right ."
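To make my question concrete, here is a minimal sketch of the kind of loss I have in mind; this is my guess, not the paper's exact formulation, and `sim` and `labels` are hypothetical names:

import torch
import torch.nn.functional as F

# Hypothetical sketch, not the paper's exact loss: `sim` holds
# phrase-to-object similarity logits of shape (num_phrases, num_objects),
# and `labels` gives the annotated object index for every phrase,
# target and non-target alike.
def phrase_grounding_loss(sim: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    # Cross-entropy over all candidate objects compares the target and
    # non-target objects simultaneously for each phrase.
    return F.cross_entropy(sim, labels)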

YYLiuDLUT commented 9 months ago

At the moment, all the unique utterances are missing from the dataset, so if I directly merge the unique data from the scanrefer dataset with yours, I won't have ground truth for non-targets to use as supervision. I'm wondering if you could release the remaining data of your dataset soon.