ParisNeo / lollms-webui

Lord of Large Language Models Web User Interface
https://lollms.com
Apache License 2.0

Is it possible to use GPU? #141

Open senkron24 opened 1 year ago

senkron24 commented 1 year ago

Is it possible to use GPU?

ParisNeo commented 1 year ago

Yes, but I need to use a GPU-compatible backend. Backends are very easy to build: all you have to do is look at the llamacpp backend example, create another folder in the backends folder, then implement another backend using a library that supports GPU. I have one using Hugging Face transformers, but it requires a beast of a PC since it doesn't run quantized models, and the smallest of the non-quantized models weighs 24 GB.

andzejsp commented 1 year ago

How does oobabooga do it then? With that repo you can allocate the amount of VRAM you have; for instance, you have 12 GB but allocate only 10 GB of GPU VRAM and then use either CPU RAM or swap. That's how I was able to load larger models there, but it's not reliable: sometimes models take twice as much RAM as they do on disk. Either way, looking forward to GPU usage because it's waaay faster :) to generate than on CPU.
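That capped-VRAM-plus-spillover behavior is roughly what Hugging Face accelerate exposes through a device map: you cap how much VRAM the model may use and the remaining weights land in CPU RAM. A minimal sketch under that assumption (the model name and the 10 GiB cap are placeholders, and whether oobabooga does exactly this internally is not confirmed here):

```python
# Sketch of capped GPU memory with transformers + accelerate.
# Model name and memory caps are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM works

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",                        # let accelerate place the layers
    max_memory={0: "10GiB", "cpu": "20GiB"},  # cap GPU 0 at 10 GiB, spill to RAM
)

inputs = tokenizer("Hello there", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```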

ParisNeo commented 1 year ago

Yes, they have a file that contains the loading of all that juicy stuff.

But I don't want to copy their code. Someone could create a repo with the backend adaptation, and I could add it as a possible backend to the UI.

There are literally three functions to implement.
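The thread doesn't name those three functions, so the following is only a guess at the shape of such a backend: the class and method names are hypothetical stand-ins, not the real lollms interface. It uses Hugging Face transformers, the GPU-capable library mentioned above:

```python
# Hypothetical sketch of a GPU backend living in its own folder under
# backends/, mirroring the llamacpp backend layout. Names are illustrative,
# not the actual lollms backend API.
from transformers import AutoModelForCausalLM, AutoTokenizer


class CustomGPUBackend:
    def __init__(self, model_name: str = "gpt2"):
        # 1) load the model: pull weights onto the GPU
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModelForCausalLM.from_pretrained(model_name).to("cuda")

    def tokenize(self, prompt: str):
        # 2) tokenize: turn text into model input ids on the GPU
        return self.tokenizer(prompt, return_tensors="pt").to("cuda")

    def generate(self, prompt: str, max_new_tokens: int = 64) -> str:
        # 3) generate: produce a completion and decode it back to text
        inputs = self.tokenize(prompt)
        output_ids = self.model.generate(**inputs, max_new_tokens=max_new_tokens)
        return self.tokenizer.decode(output_ids[0], skip_special_tokens=True)
```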

senkron24 commented 1 year ago

I really wish I had the skills or knew how to do this, because I believe it would make these tasks much faster. As of now, my computer specs are: CPU: 13th Gen Intel(R) Core(TM) i7-13700F 2.10 GHz, RAM: 32.0 GB, GPU: GTX 1080 Ti.

However, even with these specs, I'm still experiencing delays and hang-ups in many responses, which is a bit frustrating.

andzejsp commented 1 year ago

Well, there's always time to start gaining knowledge, learning new things, and contributing to the community. Google, ChatGPT, build, test, and commit.

chongy076 commented 1 year ago

It seems I missed all the fun, but I'm having eye problems and am still in the middle of recovering.

As for oobabooga, I have seen their code and tested it. I have to say it had many issues getting it to run, due to the max memory size on certain computers and GPUs. Yes, they have both GPU and non-GPU versions, but it's pretty good.

I may need to pause for a moment until my eyes have recovered. Sorry for the inconvenience; this came at the wrong time.

But the code I posted in the previous issues is there as a backup; if anyone wants HTTPS and proxy support early, I attached the code there.

The front-end .js still needs some work, though, and attention is needed to wire in that HTTPS and proxy support.

If we weren't using socketio, the fastest route would be https://pypi.org/project/waitress/.
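For the plain-WSGI case, waitress really is nearly a one-liner. A minimal sketch (the Flask app here is a stand-in, not the actual lollms app object):

```python
# Minimal waitress example for a plain WSGI app (no socketio).
from flask import Flask
from waitress import serve

app = Flask(__name__)

@app.route("/")
def index():
    return "Hello from waitress"

if __name__ == "__main__":
    # waitress serves the WSGI app on the given interface and port
    serve(app, host="0.0.0.0", port=8080)
```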

A plain WSGI server like that would be fine for small or medium-sized requests, but since we are using socketio we have to use the code below.

[This was before the latest backend change] app.txt https_main.txt
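The attachments presumably contain the real setup; as a generic illustration only, here is roughly how a Flask-SocketIO app can be served over HTTPS when eventlet is installed. The certificate paths, port, and app itself are placeholders, not taken from app.txt or https_main.txt:

```python
# Generic sketch of running Flask-SocketIO over HTTPS with eventlet.
# Certificate paths, port, and handlers are placeholders.
from flask import Flask
from flask_socketio import SocketIO, send

app = Flask(__name__)
socketio = SocketIO(app)

@socketio.on("message")
def handle_message(data):
    # echo the incoming socket message back to the sender
    send(data)

if __name__ == "__main__":
    # with the eventlet server installed, TLS arguments are passed through
    socketio.run(app, host="0.0.0.0", port=9600,
                 certfile="cert.pem", keyfile="key.pem")
```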