YuanGongND / ltu

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

CPU local inference is not working. #23

Open vivekupadhyay1 opened 5 months ago

vivekupadhyay1 commented 5 months ago

I'm running into a problem with local inference for LTU/LTU_AS. I've modified the local inference script so I can check its output on any 16 kHz WAV file, but I'm hitting an error. I'm using a CPU-only Ubuntu machine.

The error message I'm encountering in LTU_AS is: "Caught an error: mat1 and mat2 must have the same dtype."
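For context, I believe this dtype error happens when the checkpoint weights are loaded in float16 while the input features stay float32; CPU matmul kernels refuse to mix the two. Here is a minimal, self-contained reproduction of what I think is going on (a toy linear layer, not the actual LTU code):

```python
import torch

layer = torch.nn.Linear(768, 4096).half()  # weights in float16, as when a fp16 checkpoint is loaded as-is
x = torch.randn(1, 768)                    # input tensors default to float32

try:
    layer(x)                               # raises: mat1 and mat2 must have the same dtype
except RuntimeError as e:
    print("error:", e)

layer = layer.float()                      # upcast weights to float32 for CPU inference
print(layer(x).shape)                      # now works: torch.Size([1, 4096])
```

If that is indeed the cause, calling `model.float()` after loading (or loading with `torch_dtype=torch.float32`) before inference should avoid it on CPU, though I'm not sure that's the intended setup for this repo.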

I also tried LTU, but when I use 'hf-dev/transformers-main' inside the LTU folder, I get the error: "ModuleNotFoundError: No module named '_lzma'." If I use LTU_AS with 'hf-dev/transformers-main', I get the following error instead:

"Error(s) in loading state_dict for PeftModelForCausalLM: size mismatch for base_model.model.model.audio_proj.1.weight: copying a parameter with shape torch.Size([4096, 768]) from the checkpoint, while the shape in the current model is torch.Size([4096, 1280])."

I've attached both inference scripts as .txt files; please change their extension back to .py. If I've missed or misconfigured anything, please let me know.

Thank you!

inference_batch_ltu.txt inference_batch_ltu_as.txt