clevaway / J.A.R.V.I.S

Jarvis model: A Fine-tune of llama2, works 100% offline with Ollama
MIT License

so slow #9

Open cozmo14047 opened 3 months ago

cozmo14047 commented 3 months ago

Why is it so slow?

FotieMConstant commented 3 months ago

Hey @cozmo14047, could you please provide more information on what exactly is slow?

For context, the project is still under development, and it's normal if you encounter some bugs. Currently the TTS makes a request online, and voice recognition isn't the best yet.

Please bear with me as I keep improving inference and the overall model.

Also, it's advisable to run it on a minimum of 16 GB of RAM, and no tests have been made yet on OSs other than macOS 14.4.x.

cozmo14047 commented 3 months ago

Hey, sorry, there was meant to be way more text; I accidentally hit enter. Two of us have tried to run it and it's really slow, both on high-powered PCs. It takes about 4 minutes to respond, and then it produces about 1 word every minute, even for a simple "what time is it".

FotieMConstant commented 3 months ago

@cozmo14047 thanks for the heads-up on this. Could you please provide more context on the OS, system version, RAM, etc.?

Does Ollama work well on those PCs? And are you able to run Jarvis in the command line with the command shown here: https://ollama.com/fotiecodes/jarvis?

Please let me know; the more context I have, the better I'll be able to help troubleshoot any issue you might have :)
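
For example, a quick sanity check from the terminal could look like this (assuming the Ollama CLI is installed; the one-shot prompt form needs a reasonably recent version):

ollama --version                        # confirm the CLI is installed and on PATH
ollama run fotiecodes/jarvis "hello"    # one-shot prompt; note how long the reply takes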

cozmo14047 commented 3 months ago

Hi, thank you. Windows 11 and 10, 64 GB of RAM. Ollama works well and the code runs fine; it's just super slow.

cozmo14047 commented 3 months ago

For what we wanted to use it for, the response time needs to be under 3 seconds, and I'm guessing that isn't possible.

FotieMConstant commented 3 months ago

That's rather unexpected given those specs. Although it hasn't been tested on Windows, if Ollama works fine there you shouldn't have any issues.

FotieMConstant commented 3 months ago

Thanks for the feedback on the response-time requirement. It works really well on macOS 14.4.x on Apple silicon with 16 GB of RAM and an 8-core GPU, with a relatively good response time. The goal is of course to improve this as much as I can, so please keep an eye out for updates. I might need someone to help with testing on Windows and Linux as well.

cozmo14047 commented 3 months ago

Well, what do you mean by "works fine"? It doesn't throw errors, and it's been tried on 2 different PCs in 2 different countries.

FotieMConstant commented 3 months ago

I am referring to Ollama here. Jarvis uses Ollama, which runs the LLM locally after it's downloaded from the hub.
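
Concretely, a client like Jarvis just talks to the local Ollama server over HTTP (a minimal sketch; 11434 is Ollama's default port and /api/generate is its standard generate endpoint):

curl http://localhost:11434/api/generate -d '{"model": "fotiecodes/jarvis", "prompt": "what time is it?"}'
# the request never leaves your machine; the model runs locally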

FotieMConstant commented 3 months ago

Also, I will have to reproduce this error to better understand what we have here. Unfortunately, I don't have a Windows computer with those specs.

cozmo14047 commented 3 months ago

Yeah, it's not throwing any errors, so Ollama seems to be fine.

FotieMConstant commented 3 months ago

And is the speed good when you use Ollama itself? Also, are you able to chat with Jarvis from the command line when you run the command:

ollama run fotiecodes/jarvis
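
If your Ollama CLI supports it (newer builds do; worth verifying on yours), the --verbose flag prints timing statistics after each reply, which makes "slow" measurable:

ollama run fotiecodes/jarvis --verbose
# after each answer it prints stats such as "eval rate: 12.34 tokens/s";
# single-digit tokens/s usually means the model is running on CPU only
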
cozmo14047 commented 3 months ago

Didn't know how to run Ollama alone, but using that command I can chat; it's just really slow.

FotieMConstant commented 3 months ago

Hi, thanks for this feedback. I think the issue is with your Ollama setup itself: if Jarvis is slow when you chat with it from the command line using the above command, your computer probably isn't capable of running LLMs at a usable speed. This isn't an issue specific to this project.

Which is rather strange, as you have 64 GB of RAM. What GPU are you using, please? An NVIDIA RTX?
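
One way to check whether Ollama is actually using a GPU (the ollama ps subcommand only exists in newer builds, and nvidia-smi only with NVIDIA drivers installed, so treat both as assumptions about your setup):

ollama ps       # lists loaded models and whether they run on GPU or CPU
nvidia-smi      # on NVIDIA systems, shows whether ollama has VRAM allocated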

FotieMConstant commented 3 months ago

@cozmo14047 are you able to run other LLMs with Ollama? What speeds do they run at?
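
For example, pulling the stock llama2 base model from the hub would give a reference point (sizes are approximate):

ollama pull llama2    # plain 7B base model, roughly 4 GB
ollama run llama2     # if this is also slow, the problem is the machine/Ollama, not Jarvis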

cozmo14047 commented 3 months ago

No idea what GPU, but both PCs are less than a year old and very powerful.

cozmo14047 commented 3 months ago

I'm not sure; are there any commands I can use to run other LLMs?

cozmo14047 commented 3 months ago

OK, I'm trying another model now.

cozmo14047 commented 3 months ago

OK, so mine doesn't like running other ones either, so I'll try again tomorrow and ask my friend to test it too. Seems like it could indeed be Ollama; I'll let you know tomorrow if it is or not.

FotieMConstant commented 3 months ago

If you are having any specific issues or concerns related to Ollama, feel free to open an issue in their repo: https://github.com/ollama/ollama

In the meantime, please keep me posted ;)

brainiakk commented 3 months ago

@cozmo14047 I think it's more about your PC's CPU and GPU than the RAM, though the RAM matters too. I'm using a 16 GB MacBook Pro with an M1 processor and it's pretty fast with Ollama, and even with the LM Studio API.
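
If you're not sure what hardware a Windows box has, you can check from a command prompt (wmic is deprecated but still ships with Windows 10/11):

wmic cpu get name                          # CPU model
wmic path win32_VideoController get name   # GPU model(s)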

FotieMConstant commented 3 months ago

I agree with you

cozmo14047 commented 3 months ago

Hmm, well, our PCs are top of the line. I'll find out tomorrow what processor I have. This isn't going to work for what I needed it for anyway, so thanks.

cozmo14047 commented 3 months ago

@FotieMConstant so the other Ollama models work fine, apart from the 40 GB one.

FotieMConstant commented 3 months ago

Hi @cozmo14047, sorry, I didn't quite catch that. You mean the models?

cozmo14047 commented 3 months ago

Yes, the other models all work fine and fast, except for the big 40 GB one.

FotieMConstant commented 3 months ago

Thanks for this feedback, I will look into it. Perhaps work with a much more lightweight model? We'll see; I'll keep you posted.
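
For reference, a smaller or more heavily quantized build would be pulled with an explicit tag; the tag below is hypothetical until it's actually published on the hub:

ollama run fotiecodes/jarvis:7b-q4_0    # hypothetical lighter tag; check the hub page for real ones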