fawazsammani / nlxgpt

NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022 (Oral)

How to run pretrained model VQA-X with single image in Google Colab #1

Closed by MachinicGlitch 2 years ago

MachinicGlitch commented 2 years ago

Hi!

I want to run the pretrained VQAX_p model (the same one that is used in the 'Explanations with Natural Text' Hugging Face demo). I have the project files and dependencies all imported into Google Colab, but I am unsure which files/functions I should use to get an explanation for a single image.

Ellyuca commented 2 years ago

Hi @MachinicGlitch, were you able to figure it out? I am also interested in using the pretrained model on ACT-X to get the explanations for a single image.

Any ideas? I really appreciate any tips and advice. Thanks.

MachinicGlitch commented 2 years ago

Yes, I did figure it out. I feel blind in hindsight, but the Hugging Face app actually has an attached codebase with exactly what I was looking for:

https://huggingface.co/spaces/Fawaz/nlx-gpt/tree/main

What you need should be in app.py there, and the VQAX_p directory has the pretrained model.
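For anyone following along in Colab, one possible way to pull the Space's files locally is sketched below. This is not from the thread: it assumes the `huggingface_hub` package is installed, and the helper names `space_repo_id_from_url` and `download_space`, as well as the default folder name, are hypothetical.

```python
from urllib.parse import urlparse


def space_repo_id_from_url(url: str) -> str:
    """Extract 'owner/name' from a Hugging Face Space URL (hypothetical helper)."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected path shape: spaces/<owner>/<name>[/tree/<revision>]
    if len(parts) < 3 or parts[0] != "spaces":
        raise ValueError(f"not a Space URL: {url}")
    return f"{parts[1]}/{parts[2]}"


def download_space(url: str, local_dir: str = "nlx_gpt_space") -> str:
    """Download all files of the Space and return the local folder path."""
    # Imported lazily so the URL helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=space_repo_id_from_url(url),
        repo_type="space",  # Spaces are a separate repo type from models
        local_dir=local_dir,
    )
```

After `pip install huggingface_hub`, calling `download_space("https://huggingface.co/spaces/Fawaz/nlx-gpt/tree/main")` should leave app.py and the VQAX_p directory under the returned folder. The inference code itself is whatever app.py defines, so read that file for the actual entry point.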

fawazsammani commented 2 years ago

Dear @MachinicGlitch and @Ellyuca, I'm really sorry for the late reply. I was under a lot of pressure at work this past month.

It's good that you figured it out! Let me know if there are other problems.

Ellyuca commented 2 years ago

Hi @fawazsammani, I am trying to run the pretrained model on ACT-X using a single image. So, given an image, I want to get the activity recognition label and the explanation for it. @MachinicGlitch kindly pointed me to the code used for the VQA-X demo.

I was wondering how I can adapt that code for the ACT-X task. Can I just use it as is and pass an empty string (something like "") as the question? Do you perhaps have a tutorial/demo for ACT-X as well?

Thank you for your time and any tips and advice you can give me.

PS: @MachinicGlitch thank you as well.

fawazsammani commented 2 years ago

Hi @Ellyuca, yes, I have one for ACT-X which is 95% done. I stopped because I was busy the last month. I will finish it up today, maybe tonight, and post the link here as well as in the README.

Best Regards

Ellyuca commented 2 years ago

Dear @fawazsammani,

Thank you! That will help me a lot! I really appreciate it!

Best wishes!

fawazsammani commented 2 years ago

@Ellyuca you can find it here: https://huggingface.co/spaces/Fawaz/nlx_gpt_action

Ellyuca commented 2 years ago

@fawazsammani Thanks a lot!!!