lelapin123 opened 1 year ago
The answers are pretty accurate (on one file that is a bit tricky to understand), but it is slower than privateGPT.
Are you running it on a GPU?
On Windows: yes, I can see it. Unless it loads into the GPU but actually uses the CPU.
Can you share details of your hardware and CUDA version? I will have a look at it.
Hardware is a stock 3090; CUDA is 11.7.
Look, I watched your video: do you really need the Visual Studio environment to make this work? I notice that the loading time of your model is around 12 seconds, while for mine it is around 2 min 20 s.
If I check the memory, I notice that the model is very slow to load into memory, and then suddenly I receive an answer very fast. I think my problem is likely related to how the model loads into my computer's memory. Is there a way to improve this?
You are right, you don't need Visual Studio Code to make it work. It was just to show the code. It's better to just run it directly in the terminal.
Not sure what could be causing this. In my case, I am loading it from an SSD. Not sure what your storage is. I can't really think of anything else at the moment. I will keep this open in case anyone else encounters this or we can figure something out.
Huggingface stores its model here: C:\Users\username\.cache\huggingface\hub
and my C drive is SSD too
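A quick way to double-check which directory the models are actually loading from (a minimal sketch, assuming the default Hugging Face cache layout; the `HUGGINGFACE_HUB_CACHE` environment variable, when set, overrides it):

```python
import os

# Default Hugging Face hub cache directory; on Windows, "~" expands to
# C:\Users\<username>, matching the path above. HUGGINGFACE_HUB_CACHE
# overrides the default when set.
default_cache = os.path.join(os.path.expanduser("~"), ".cache", "huggingface", "hub")
cache_dir = os.environ.get("HUGGINGFACE_HUB_CACHE", default_cache)
print(cache_dir)
```

If that directory sits on a different (slower) drive than expected, that would explain part of the load time.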
Not sure what else it could be. Someone might have a better idea. Sorry I couldn't help.
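To rule storage in or out, here is a minimal sketch that times a sequential read of a file to estimate disk throughput (the model path in the comment is hypothetical); a cold model load can't be faster than whatever number this reports:

```python
import time

def read_throughput_mb_s(path, chunk_size=1 << 20):
    """Sequentially read `path` in 1 MB chunks and return throughput in MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            buf = f.read(chunk_size)
            if not buf:
                break
            total += len(buf)
    elapsed = time.perf_counter() - start
    return total / elapsed / 1e6

# Example (hypothetical model file path):
# print(read_throughput_mb_s(r"C:\Users\username\.cache\huggingface\hub\model.bin"))
```

Note the OS page cache will inflate the number on a second run, so measure on the first read after a reboot for a cold-load figure.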
@lelapin123 I'm repeating myself, but give CASALIOY a try. It's faster than privateGPT and fixes the issues those repos won't.
I'm also seeing very slow performance. I tried CPU and default CUDA on macOS with an Apple M1 chip and the embedded GPU. I see a python3.11 process using 400% CPU (presumably pegging 4 cores with multithreading), ~50 threads, and 4 GB of RAM. It will sit at those stats for a while, like 60 seconds, then respond. Is it supposed to be this slow?
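For the CPU-pegging case, one thing worth trying is capping the worker-thread count before the heavy libraries are imported. This is only a sketch: the variable names below are the standard OpenMP/MKL ones, and whether they take effect depends on which BLAS/llama backend localGPT was built against.

```python
import os

# Cap OpenMP/MKL worker threads at the reported core count. These env vars
# are standard for OpenMP and Intel MKL, but whether the inference backend
# honors them depends on how it was compiled (an assumption here).
n_threads = str(os.cpu_count() or 1)
os.environ.setdefault("OMP_NUM_THREADS", n_threads)
os.environ.setdefault("MKL_NUM_THREADS", n_threads)
print(os.environ["OMP_NUM_THREADS"])
```

Oversubscription (more threads than cores) can make generation slower, which would match the "400% CPU but ~50 threads" observation.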
Personally, it took 25 minutes to answer the same question ("à quoi sert", i.e. "what is this for") from the code walkthrough presentation. How did you obtain this nice performance?
Hello @PromtEngineer, localGPT takes too much time to give a result, and I am using a TP GPU in Google Colab.
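A quick way to confirm the Colab runtime actually exposes an NVIDIA GPU (a minimal sketch using `nvidia-smi`; it returns False when no NVIDIA driver is present, so it also runs harmlessly on a CPU-only machine):

```python
import shutil
import subprocess

def gpu_visible():
    """Return True if nvidia-smi is on PATH and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        result = subprocess.run([exe, "-L"], capture_output=True, text=True, timeout=10)
    except OSError:
        return False
    return result.returncode == 0 and "GPU" in result.stdout

print(gpu_visible())
```

If this prints False in Colab, the runtime type is not set to GPU and everything will fall back to CPU, which would explain the slowness.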
It takes a lot of time to get a result