A simple question - Githubissues

Sreyan88 / GAMA

Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

Apache License 2.0

80 stars, 8 forks

Hi! Thank you for your excellent work! When running inference with GAMA, I encountered the same problem as with LTU. I loaded an audio clip from the AudioSet eval set in which a man is speaking. When I ask "Describe the audio.", GAMA returns an accurate answer: "Audio caption: A man is speaking and beeping his car keys as he gets out of his car and walks away to open something.". However, when I ask "Determine the gender of the speaker." or "Who's speaking? A man or a woman?", GAMA returns "The gender of the speaker is not specified." and "It is not specified in the audio clip who is speaking. It could be either a man or a woman.", respectively. The caption shows that the model recognizes a man speaking, yet it declines to say so when asked directly. What causes this inconsistency?

A simple question #10