GiuseppeZani opened this issue 3 months ago
!following. I have an RTX 3060 and am having model loading issues… Not sure if this is related.
Same here.
@RiaGruen the original issue mentions version 0.2.31, which is NOT the newest version. The newest version is 0.3.2, available at https://lmstudio.ai.
What is the issue you are encountering? Please share screenshots / logs. Thank you.
Here's the log:
[2024-09-01 16:20:16][INFO][LM STUDIO SERVER] Running chat completion on conversation with 1 messages.
[2024-09-01 16:20:39][ERROR] Unknown exception during inferencing.. Error Data: n/a, Additional Data: n/a
[2024-09-01 16:20:39][INFO][LM STUDIO SERVER] Client disconnected. Stopping generation..
I didn't encounter that with the previous version, nor with a different graphics card.
Thanks. We've seen this happen when the system prompt + first user message are longer than the context length.
Note that this is not related to the original GitHub issue. Please create a new one here: https://github.com/lmstudio-ai/lmstudio-bug-tracker/issues and I'll help you debug.
Please share a screenshot of your model load parameters, specifically the context length.
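(Editor's note: if you want to rule out the context-length overflow mentioned above before filing a new issue, a rough sanity check is to count the prompt tokens yourself and compare against the configured context length. The sketch below is only an approximation, not LM Studio's own implementation: it assumes the local server is running on its default port 1234, uses tiktoken's cl100k_base encoding as a stand-in for the model's actual tokenizer, and omits the "model" field on the assumption that the server routes to the currently loaded model.)

```python
# Rough prompt-size sanity check (approximation only; the model's real tokenizer may differ).
# Assumes: LM Studio local server on its default port 1234, and
# `pip install tiktoken requests`.
import requests
import tiktoken

CONTEXT_LENGTH = 4096  # the value set in the model load parameters

system_prompt = "You are a helpful assistant."
first_user_message = "Hello, can you summarize this document for me?"

enc = tiktoken.get_encoding("cl100k_base")
approx_prompt_tokens = len(enc.encode(system_prompt)) + len(enc.encode(first_user_message))
print(f"~{approx_prompt_tokens} prompt tokens vs. context length {CONTEXT_LENGTH}")

# Only send the request if the prompt plausibly fits; leave headroom for the reply.
if approx_prompt_tokens < CONTEXT_LENGTH:
    resp = requests.post(
        "http://localhost:1234/v1/chat/completions",
        json={
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": first_user_message},
            ],
        },
        timeout=120,
    )
    print(resp.status_code)
    print(resp.json())
```

If the request still fails with the prompt well under the limit, the overflow explanation probably doesn't apply and the logs above point elsewhere.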
Is it possible to run two versions of LM Studio in parallel? Otherwise I can't reproduce it, because I need a reliable version for my job.
I just reverted to 0.2.27 and it works fine again. Pity, really, because I would love to use large(r) context.
Talking of which: Is there any place one can download older versions? I had to dig through my backups to find the installer for 0.2.27, the website offers only the latest.
The context length is set to 4096 and the input is lower than 2.5k tokens, so that shouldn't be a problem.
@RiaGruen it would be really helpful if you create a separate issue and provide the info I mentioned above.
Reverting to an older version won't help us improve the software for everyone.
@yagil I know :( Unfortunately I have a job that feeds my cats and thus takes precedence.
But do tell me, please, whether it is possible to run two versions of LM Studio in parallel? If I can do that, I can test as well as feed the cats.
My 2x Tesla P40 GPUs cannot be detected by version 0.3.2, but they work fine in version 0.2.18. In version 0.3.2, it shows the following result in "settings":
GPU (LM Runtime Dependent)
gguf runtime: CPU llama.cpp v1.1.7
{ "result": { "code": "NoDevicesFound", "message": "No gpus found without acceleration backend compilation!" }, "gpuInfo": [] }
Same here, v0.3.5:
gguf runtime: CPU llama.cpp (Linux) v1.2.0
{ "result": { "code": "NoDevicesFound", "message": "No gpus found without acceleration backend compilation!" }, "gpuInfo": [] }
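(Editor's note: before digging into LM Studio itself, it can help to confirm that the NVIDIA driver still enumerates the cards at all. A minimal sketch, assuming nvidia-smi is installed and on PATH, as it normally is with the NVIDIA driver on both Windows and Linux:)

```python
# Check whether the NVIDIA driver can see the GPUs at all, independent of LM Studio.
# Assumes nvidia-smi is installed and on PATH.
import subprocess

try:
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version,memory.total", "--format=csv"],
        capture_output=True,
        text=True,
        check=True,
    )
    print(result.stdout)  # one line per detected GPU
except FileNotFoundError:
    print("nvidia-smi not found -- is the NVIDIA driver installed?")
except subprocess.CalledProcessError as err:
    print("nvidia-smi failed:", err.stderr)
```

If the GPUs show up here but LM Studio still reports "NoDevicesFound", the problem is more likely in the app's runtime selection than in the driver, which is useful detail to attach to a bug report.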
The new version, app-0.2.31, doesn't seem to use (offload to) the NVIDIA RTX 3060 anymore. The old one (app-0.2.29) works flawlessly. My spec: i7-10700, RTX 3060 Phoenix 12 GB.
The GPU Offload checkbox is checked, same settings as before, but everything runs on the CPU and system RAM. I will be glad to supply further info to help fix the problem.
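(Editor's note: one way to tell whether GPU Offload is actually doing anything is to watch VRAM usage while the model loads. A small sketch under the assumption that the nvidia-ml-py package is installed (import name pynvml) and the RTX 3060 is the first GPU:)

```python
# Poll VRAM usage while loading the model in LM Studio; if GPU offload works,
# the "used" number should jump by roughly the model size.
# Assumes: pip install nvidia-ml-py (imported as pynvml).
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU; adjust the index if needed

for _ in range(30):  # roughly a minute of polling
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    print(f"VRAM used: {mem.used / 1024**2:.0f} MiB / {mem.total / 1024**2:.0f} MiB")
    time.sleep(2)

pynvml.nvmlShutdown()
```

If VRAM stays flat while system RAM fills up during the load, that matches the behavior described above and would be helpful evidence in the bug report.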