turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

Does LoRA work the same here as with the exllamav2 loader in text-generation-webui? #343

Open LiangA opened 4 months ago

LiangA commented 4 months ago

Just as the title says: I tested my LoRA in text-generation-webui and it behaved as I wanted, but my main program is built entirely on exllamav2, so I started integrating the LoRA into my exllamav2 project. When I tried to run the LoRA using lora.py from the exllamav2 repo, I found the outputs were completely different from those generated by text-generation-webui.

Can anyone elaborate on the difference between these two cases?

1. loading the model with the exllamav2 loader and then loading the LoRA in text-generation-webui
2. loading the model and LoRA with the example code lora.py in the exllamav2 repo
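
For reference, here is roughly what I'm running for case 2, following the pattern in the repo's lora.py example (the paths are placeholders, and the exact class and argument names are my best reading of that example, so treat this as a sketch rather than my exact code):

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler
from exllamav2.lora import ExLlamaV2Lora

# Load the base model (paths are placeholders)
config = ExLlamaV2Config()
config.model_dir = "/path/to/base-model"
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

# Load the LoRA adapter on top of the base model
lora = ExLlamaV2Lora.from_directory(model, "/path/to/lora")

# Generate with the adapter applied by passing it to the generator
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
output = generator.generate_simple("Hello", settings, 128, loras=lora)
print(output)
```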

Any thoughts would be much appreciated!