OSU-NLP-Group / MagicBrush

[NeurIPS'23] "MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing".
https://osu-nlp-group.github.io/MagicBrush/

error in evaluation set #15

Closed betterze closed 7 months ago

betterze commented 7 months ago

Dear MagicBrush team,

Thank you for sharing this good dataset.

In some examples in global_descriptions.json, the source prompt is identical to the target prompt. For example:

'294330-output2.png': 'A home office with a desk, leather bound books, and computers displaying code.',
 '294330-output3.png': 'A home office with a desk, leather bound books, and computers displaying code.'
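
A rough way to list such cases is sketched below. It assumes a flat mapping from file name to description, with file names shaped like `<image_id>-output<turn>.png` as in the two entries above; the real layout of global_descriptions.json may differ (for example, it may be nested per image), and `find_unchanged_descriptions` is a hypothetical helper, not part of the MagicBrush code.

```python
import re
from collections import defaultdict

# Assumption: file names look like "<image_id>-output<turn>.png",
# as in the excerpt from global_descriptions.json.
OUTPUT_NAME = re.compile(r"^(\d+)-output(\d+)\.png$")


def find_unchanged_descriptions(descriptions):
    """Return (previous_file, next_file) pairs of consecutive turns
    whose descriptions are identical after stripping whitespace."""
    turns_by_image = defaultdict(dict)
    for file_name, text in descriptions.items():
        match = OUTPUT_NAME.match(file_name)
        if match:
            image_id, turn = match.group(1), int(match.group(2))
            turns_by_image[image_id][turn] = (file_name, text)

    duplicates = []
    for image_id in sorted(turns_by_image):
        turns = turns_by_image[image_id]
        for turn in sorted(turns):
            if turn + 1 in turns:
                prev_name, prev_text = turns[turn]
                next_name, next_text = turns[turn + 1]
                if prev_text.strip() == next_text.strip():
                    duplicates.append((prev_name, next_name))
    return duplicates


# The two entries quoted in this issue.
sample = {
    "294330-output2.png": "A home office with a desk, leather bound books, and computers displaying code.",
    "294330-output3.png": "A home office with a desk, leather bound books, and computers displaying code.",
}
print(find_unchanged_descriptions(sample))
```

To run this on the actual file, load it with `json.load` and, if the descriptions turn out to be nested, flatten them into a single mapping first.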

Also, some instructions in edit_sessions.json do not read like instructions. For example,

{'input': '43917-output1.png',
 'mask': '43917-mask2.png',
 'output': '43917-output2.png',
 'instruction': 'It should have french fries on the plate.'}

"It should have french fries on the plate." This is not an instruction. Why not just use 'add french fries on the plate'?
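
For anyone who wants to see how many turns are phrased this way, here is a minimal sketch. It assumes each turn is a dict with an `instruction` key, as in the entry above; the verb list `EDIT_VERBS` and both helper functions are hypothetical and only illustrate one crude heuristic.

```python
# Hypothetical verb list for illustration only; it is not part of MagicBrush.
EDIT_VERBS = {"add", "remove", "replace", "change", "make", "put", "turn", "delete"}


def is_imperative(instruction):
    """Return True if the first word of the instruction is in EDIT_VERBS."""
    words = instruction.strip().lower().split()
    return bool(words) and words[0] in EDIT_VERBS


def split_by_form(turns):
    """Split turns into (verb-first, other) lists based on their instruction."""
    imperative, other = [], []
    for turn in turns:
        if is_imperative(turn["instruction"]):
            imperative.append(turn)
        else:
            other.append(turn)
    return imperative, other


# The turn quoted above, plus the alternative phrasing suggested in this issue.
turns = [
    {
        "input": "43917-output1.png",
        "mask": "43917-mask2.png",
        "output": "43917-output2.png",
        "instruction": "It should have french fries on the plate.",
    },
    {"instruction": "add french fries on the plate"},
]
imperative, other = split_by_form(turns)
print(len(imperative), "verb-first,", len(other), "other")
```

A first-word check like this will miss many valid phrasings, so it is better suited to surveying the data than to filtering it.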

Thank you for your help.

Best Wishes,

Zongze

drogozhang commented 7 months ago

Hello Zongze,

Thanks for your interest and questions. We used ChatGPT to generate the global descriptions and did not edit them manually, so feel free to edit or regenerate them if you are not satisfied with their quality.

As for the instructions, we believe they should not be limited to starting with a verb or following a fixed pattern. In fact, many people in the vision community even regard questions as instructions (see the BLIP-2 and LLaVA papers).

As long as an "instruction" clearly expresses the user's edit intent, we consider it qualified to stay in the dataset. We also want the instructions to be as diverse as possible, covering any form of instruction, to make models more robust at instruction following.

Let me know if you have any other questions.

Best, Kai

betterze commented 7 months ago

Got it. Thanks a lot.