YuanGongND / ltu

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

Inference of 13B (Beta) #49

Open nicolaus625 opened 2 months ago

nicolaus625 commented 2 months ago

Is there any client API for LTU-AS (13B)?

I cannot find the 13B checkpoint in the GitHub repo, and the API only supports "7B (Default)", not "13B (Beta)".

nicolaus625 commented 2 months ago

Also, `client = Client("https://yuangongfdu-ltu-2.hf.space/")` does not support 13B.
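
For reference, a minimal sketch of the hosted 7B call that does work, assuming the space exposes a /predict endpoint taking an audio path, a question, and a model choice (the argument order and the model-choice string are assumptions; the space's "Use via API" page has the exact signature):

from gradio_client import Client

# Connect to the public LTU-AS space.
client = Client("https://yuangongfdu-ltu-2.hf.space/")

# Argument list is assumed from the UI options discussed above;
# only "7B (Default)" is currently served.
result = client.predict(
    "path/to/audio.wav",                      # local wav file (16 kHz assumed)
    "What can be inferred from the audio?",   # question
    "7B (Default)",                           # model choice; "13B (Beta)" is rejected
    api_name="/predict",
)
print(result)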

And the request

import requests

# audio_path and question are defined earlier in the script
response = requests.put('http://sls-titan-7.csail.mit.edu:8080/items/0', json={
    'audio_path': audio_path, 'question': question
})

fails with a similar error to the previous issue https://github.com/YuanGongND/ltu/issues/2.

The solution mentioned in that issue only works for the "7B (Default)" setting.

YuanGongND commented 2 months ago

Yes, the 13B server is down because I left MIT and can no longer use that much compute. However, the checkpoint is open-sourced, so you can run inference on your own GPU.

Sorry for the inconvenience.
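
For later readers, a rough sketch of what running it locally might look like; it launches the repo's own demo script, but the checkpoint path, how the script picks it up, and the directory layout are all placeholders, so follow the README for the real steps:

import os
import subprocess

REPO_DIR = "/path/to/ltu"                  # where YuanGongND/ltu is cloned
CKPT = "/path/to/ltu_as_13b_checkpoint"    # the open-sourced 13B weights (placeholder path)

# It is assumed you first point src/ltu_as/inference_gradio.py at CKPT
# (e.g. by editing the checkpoint path inside the script), then launch
# the Gradio demo on your own GPU:
subprocess.run(
    ["python", "src/ltu_as/inference_gradio.py"],
    cwd=REPO_DIR,
    env={**os.environ, "CUDA_VISIBLE_DEVICES": "0"},  # select a GPU
    check=True,
)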

nicolaus625 commented 1 month ago

ltu-main/src/ltu_as/inference_gradio.py

Is the demo in the above script the 7B (Default) or the 13B (Beta) model? And where can I find the demo for the other checkpoint?