McGill-NLP / AURORA

Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation
https://aurora-editing.github.io/
MIT License

ImagenHub Integration #9

Open · vinesmsuic opened this issue 1 month ago

vinesmsuic commented 1 month ago

Very cool work!

We have added an implementation of AURORA to ImagenHub (ICLR 2024). Feel free to check out the results in our ImagenHub visualization and to verify the code integration :)

Visualization: https://chromaica.github.io/Museum/ImagenHub_Text-Guided_IE/
Code integration: https://github.com/TIGER-AI-Lab/ImagenHub/pull/35
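For a quick start, here is a minimal sketch of invoking the integration through ImagenHub's standard `load` / `infer_one_image` interface. The registry name `"AURORA"` and the exact keyword arguments are assumptions on my part; the linked PR has the actual registration and signature.

```python
# Minimal sketch: running AURORA through ImagenHub.
# Assumptions: the model is registered under the name "AURORA", and the
# instruction-editing models take a source image plus an instruction prompt
# (see TIGER-AI-Lab/ImagenHub#35 for the actual name and signature).
from PIL import Image

import imagen_hub

model = imagen_hub.load("AURORA")  # assumed registry name

src = Image.open("input.jpg").convert("RGB")  # image to edit
edited = model.infer_one_image(
    src_image=src,
    instruct_prompt="move the cup to the left of the plate",  # edit instruction
    seed=42,  # fixed seed for reproducibility
)
edited.save("edited.jpg")
```

Best,
Max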

BennoKrojer commented 1 month ago

Hi Max,

That's awesome! Great community effort on your end to unify everything so nicely. Out of curiosity: how many of the prompts play to AURORA's strengths, i.e. spatial and action editing? ("Strengths" is maybe a strong word since these edits are very hard, so let's say the cases where it doesn't fail 99% of the time.)

Best, Benno

BennoKrojer commented 1 month ago

As a suggestion, you could integrate some of the examples from AURORA-Bench; that would make the benchmark more diverse in terms of skills and reasoning. They are all readily available in the repo here: https://github.com/McGill-NLP/AURORA/blob/main/test.json (see the sketch below for pulling them in).
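For example, a minimal sketch of fetching the examples directly from the repo; this only uses the raw-file URL derived from the link above and inspects the entries rather than assuming a schema:

```python
# Minimal sketch: download AURORA-Bench examples and inspect their structure.
# Assumption: test.json is a JSON list of per-example dicts; the field names
# are printed here rather than hard-coded.
import json
from urllib.request import urlopen

URL = "https://raw.githubusercontent.com/McGill-NLP/AURORA/main/test.json"

with urlopen(URL) as resp:
    examples = json.load(resp)

print(f"{len(examples)} examples")
print("fields of first entry:", sorted(examples[0].keys()))
```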

BennoKrojer commented 1 month ago
[Image: table of AURORA-Bench tasks showing which skills each data source covers]

In the table above you can quickly see which tasks cover skills that were previously not well covered (AG, Something-Something, EPIC-Kitchens, WhatsUp, CLEVRER, Kubric). Some might be too niche for your purposes, but AG, Something-Something, and WhatsUp cover quite general skills you would want from an editing model.