Inference on a single audio input

Sreyan88 / GAMA

Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

https://sreyan88.github.io/gamaaudio/

Apache License 2.0


Inference on a single audio input #18

Open jrohsc opened 23 hours ago

jrohsc commented 23 hours ago

Hi,

I tried to modify the gama_inf.py code to test the model on a single audio input, but I am running into the following error: ValueError: The following "model_kwargs" are not used by the model: ['audio_input'] (note: typos in the generate arguments will also show up in this list).

Is there a way to fix this error? Or do you have any code we could use to test the model on a single audio input?
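For context, this message comes from the argument check that `generate()` in Hugging Face Transformers runs before decoding: keyword arguments that the model's `forward` (or `prepare_inputs_for_generation`) signature does not declare are rejected. The sketch below is a simplified stand-in for that check, not the library's own code; `StockModel`, `AudioModel`, and `unused_kwargs` are hypothetical names used only for illustration.

```python
import inspect


class StockModel:
    """Toy stand-in for a model whose forward() has no audio argument."""

    def forward(self, input_ids=None, attention_mask=None):
        return input_ids


class AudioModel:
    """Toy stand-in for a model whose forward() accepts audio_input."""

    def forward(self, input_ids=None, attention_mask=None, audio_input=None):
        return input_ids


def unused_kwargs(model, **model_kwargs):
    """Return the names of non-None kwargs that model.forward does not declare."""
    accepted = set(inspect.signature(model.forward).parameters)
    return [k for k, v in model_kwargs.items() if v is not None and k not in accepted]


print(unused_kwargs(StockModel(), audio_input=[0.0]))  # ['audio_input']
print(unused_kwargs(AudioModel(), audio_input=[0.0]))  # []
```

If the first case matches what happens locally, the model class that got loaded does not know about `audio_input`, which would be consistent with a stock Transformers install being imported instead of a modified one.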

sonalkum commented 22 hours ago

Can you share your gama_inf.py file so that we can try to reproduce this issue? We have never encountered it ourselves. You can also check https://huggingface.co/spaces/sonalkum/GAMA to see how to run inference on a single audio file.

jrohsc commented 22 hours ago

Is there a way to do it in code instead of through a demo? A notebook file would be helpful!

Sreyan88 commented 22 hours ago

Hi @jrohsc,

I think you might have missed installing our version of Transformers. Please check the installation instructions here:

https://github.com/Sreyan88/GAMA?tab=readme-ov-file#setup-%EF%B8%8F
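One way to check which copy of Transformers a notebook or script actually picks up is to ask Python where the package resolves from. This is a minimal sketch using only the standard library; `describe_package` is a hypothetical helper name, and it assumes the repo's version installs under the usual `transformers` import name.

```python
import importlib.metadata
import importlib.util


def describe_package(name="transformers"):
    """Report where a package would be imported from and its installed version."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"name": name, "installed": False, "location": None, "version": None}
    try:
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        version = None  # importable, but no distribution metadata was found
    return {"name": name, "installed": True, "location": spec.origin, "version": version}


# Run this in the same environment (and kernel) that runs gama_inf.py.
print(describe_package("transformers"))
```

If the reported location points into a generic site-packages install rather than the copy set up by the README steps, the environment is probably importing a stock release.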

jrohsc commented 21 hours ago

Here is the ipynb code I used on my local machine. Assume that it is in the GAMA directory. https://drive.google.com/file/d/1biav6H9EEvDzPG4GSt1CdSPvnUIJs00V/view?usp=sharing

jrohsc commented 16 hours ago

Hi, I also tried the Python script, but I got the same error even though I installed everything according to the instructions. I downloaded the Llama-2-7b-chat-hf-qformer folder as the base model and used checkpoint-2500/pytorch_model.bin as the eval model path. What could be causing the error above?