shansongliu / MU-LLaMA

MU-LLaMA: Music Understanding Large Language Model

Can this be used, for example, in a quantised version? #15

Closed 0417keito closed 11 months ago

0417keito commented 11 months ago

Can this be used, for example, in a quantised version? I tried to run inference, but it runs out of memory and cannot be used.

shansongliu commented 11 months ago

> Can this be used, for example, in a quantised version? I tried to run inference, but it runs out of memory and cannot be used.

Currently we haven't implemented a quantized version of our MU-LLaMA model. You are welcome to make a pull request if you plan to do so. At the moment, the working setup is inference on a 1-minute music file using a 32GB V100 GPU.
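For anyone hitting the same out-of-memory error before an official quantized release exists, one stopgap is PyTorch's dynamic quantization of the Linear layers. The sketch below is only an illustration, not part of MU-LLaMA; the tiny stand-in model is a placeholder for the actual loaded checkpoint, and note that dynamic quantization in stock PyTorch only takes effect on CPU, so it trades speed for memory rather than reducing GPU usage.

```python
import torch
import torch.nn as nn

# Stand-in for the already-loaded MU-LLaMA model (in practice, load it as in
# the repository's inference script); used here only so the snippet runs.
model = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(), nn.Linear(4096, 4096))

# Dynamic quantization stores nn.Linear weights as int8 and dequantizes them
# on the fly, shrinking the memory footprint of the linear layers.
quantized = torch.quantization.quantize_dynamic(
    model.cpu(),        # dynamic quantization runs on CPU in stock PyTorch
    {nn.Linear},        # quantize only the Linear layers
    dtype=torch.qint8,
)
```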

0417keito commented 11 months ago

Thank you for your reply. Noted.

0417keito commented 11 months ago

I propose LLaSA, which leverages the generic architecture of LLaVA to align different features, making it compatible with audio features from encoders such as MERT and HuBERT. I believe this can lead to a multi-modal LLM that can understand music and audio. What do you think?

https://github.com/0417keito/LLaSA
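For context on what that alignment would look like: a LLaVA-style setup usually reduces to a small projection module that maps frozen audio-encoder features (e.g. MERT or HuBERT hidden states) into the LLM's token-embedding space, so the projected frames can be prepended to the text embeddings. The sketch below only illustrates that idea; the class name and dimensions are made up, and it is not code from LLaSA or MU-LLaMA.

```python
import torch
import torch.nn as nn

class AudioFeatureProjector(nn.Module):
    """Illustrative LLaVA-style adapter: maps frozen audio-encoder features
    (e.g. MERT/HuBERT hidden states) into the LLM embedding space."""

    def __init__(self, audio_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        # Simple two-layer MLP projector; dimensions are placeholders.
        self.proj = nn.Sequential(
            nn.Linear(audio_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, audio_feats: torch.Tensor) -> torch.Tensor:
        # (batch, frames, audio_dim) -> (batch, frames, llm_dim)
        return self.proj(audio_feats)

# Hypothetical usage: project encoder output, then prepend to text embeddings.
projector = AudioFeatureProjector()
audio_feats = torch.randn(1, 128, 1024)   # placeholder for MERT/HuBERT output
audio_tokens = projector(audio_feats)     # shape (1, 128, 4096)
```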

shansongliu commented 11 months ago

> I propose LLaSA, which leverages the generic architecture of LLaVA to align different features, making it compatible with audio features from encoders such as MERT and HuBERT. I believe this can lead to a multi-modal LLM that can understand music and audio. What do you think?
>
> https://github.com/0417keito/LLaSA

Yeah, I think this is possible.

0417keito commented 11 months ago

I don't have the resources to train a 7B-level LLM. I might be able to train a 1.3B-level model, but I believe the accuracy would be lower. So, I really don't mean to be rude, but do you have the capacity to train this within the MU-LLaMA project?

shansongliu commented 11 months ago

> I don't have the resources to train a 7B-level LLM. I might be able to train a 1.3B-level model, but I believe the accuracy would be lower. So, I really don't mean to be rude, but do you have the capacity to train this within the MU-LLaMA project?

There are plenty of memory-saving techniques you may want to look at, such as gradient accumulation and FSDP, which may help you resolve the memory issues. As for the capacity you mentioned, I'm afraid we don't have the extra time and resources to handle requests that are not directly related to our project. Sorry for that.
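Of the techniques mentioned above, gradient accumulation is the easiest to drop into an existing training loop. The snippet below is a generic PyTorch sketch with toy stand-ins for the model, optimizer, and data loader; it is not MU-LLaMA's actual training code.

```python
import torch
import torch.nn as nn

# Toy stand-ins so the example runs; replace with the real model, optimizer,
# and dataloader from your training script.
model = nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loader = [(torch.randn(4, 16), torch.randn(4, 1)) for _ in range(32)]

accum_steps = 8  # effective batch size = per-step batch size * accum_steps

optimizer.zero_grad()
for step, (inputs, targets) in enumerate(loader):
    loss = nn.functional.mse_loss(model(inputs), targets)
    (loss / accum_steps).backward()   # scale so accumulated grads average out
    if (step + 1) % accum_steps == 0:
        optimizer.step()              # one weight update per accumulation window
        optimizer.zero_grad()
```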

0417keito commented 11 months ago

Thank you for considering my request and for the valuable information. I apologize for any inconvenience my question may have caused. I sincerely wish you all success. Thank you.

shansongliu commented 11 months ago

> Thank you for considering my request and for the valuable information. I apologize for any inconvenience my question may have caused. I sincerely wish you all success. Thank you.

Not a big deal. If you have any further questions, I'm happy to respond.