lmstudio-ai / lmstudio-bug-tracker

Bug tracking for the LM Studio desktop application

New version doesn't support RTX 3060 anymore #107

Open GiuseppeZani opened 3 months ago

GiuseppeZani commented 3 months ago

The new version (app-0.2.31) doesn't seem to offload to the NVIDIA RTX 3060 anymore. The old one (app-0.2.29) works flawlessly. My specs: i7-10700, RTX 3060 Phoenix 12 GB.

The GPU Offload checkbox is checked, same setting as before, but everything runs on the CPU and system RAM. I will be glad to supply further info to help fix the problem.
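One way to check whether the card is actually being used is to watch VRAM usage while a model loads and generates. A minimal sketch, assuming `nvidia-smi` is on the PATH (the polling interval and duration are arbitrary):

```python
import subprocess
import time

def watch_vram(seconds: int = 30, interval: float = 2.0) -> None:
    """Print GPU memory usage so you can see whether offload happens."""
    for _ in range(int(seconds / interval)):
        out = subprocess.run(
            ["nvidia-smi",
             "--query-gpu=name,memory.used,memory.total",
             "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
        print(out.stdout.strip())
        time.sleep(interval)

if __name__ == "__main__":
    # Run this while loading a model in LM Studio: if memory.used
    # barely moves, the model is not being offloaded to the GPU.
    watch_vram()
```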

JWhiteUX commented 2 months ago

Following — I have an RTX 3060 and am having model-loading issues… Not sure if this is related.

RiaGruen commented 2 months ago

Same here.

yagil commented 2 months ago

@RiaGruen the original issue mentions version 0.2.31, which is NOT the newest version. The newest version is 0.3.2, available at https://lmstudio.ai.

What is the issue you are encountering? Please share screenshots / logs. Thank you.

RiaGruen commented 2 months ago

Here's the log:

```
[2024-09-01 16:20:16][INFO][LM STUDIO SERVER] Running chat completion on conversation with 1 messages.
[2024-09-01 16:20:39][ERROR] Unknown exception during inferencing.. Error Data: n/a, Additional Data: n/a
[2024-09-01 16:20:39][INFO][LM STUDIO SERVER] Client disconnected. Stopping generation..
```

I didn't encounter this with the previous version, nor with a different graphics card.
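For anyone trying to reproduce the failing request, a minimal call against the local server looks roughly like the sketch below. The port and model name are assumptions (1234 is LM Studio's default server port; the model field depends on what is loaded):

```python
import json
import urllib.request

# Minimal sketch of a chat completion request against the local
# LM Studio server (OpenAI-compatible API). Port 1234 is the default;
# "local-model" is a placeholder for whatever model is loaded.
payload = {
    "model": "local-model",
    "messages": [{"role": "user", "content": "Hello"}],
}
req = urllib.request.Request(
    "http://localhost:1234/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["choices"][0]["message"]["content"])
```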

yagil commented 2 months ago

Thanks. We've seen this happen when the combined system prompt and first user message are longer than the context length.

Note that this is not related to the original GitHub issue. Please create a new one here: https://github.com/lmstudio-ai/lmstudio-bug-tracker/issues and I'll help you debug.

Please share a screenshot of your model load parameters, specifically the context length.
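As a rough pre-check before filing the issue, you can compare an approximate token count of the system prompt plus the first message against the configured context length. This is only a heuristic sketch; the ~4-characters-per-token ratio is an assumption and varies by tokenizer and language:

```python
# Heuristic sketch: estimate whether prompt + first message fit the context.
# The ~4 chars/token ratio is a rough English-text assumption, not exact.
CHARS_PER_TOKEN = 4

def estimated_tokens(text: str) -> int:
    return len(text) // CHARS_PER_TOKEN

def fits_context(system_prompt: str, first_message: str,
                 context_length: int = 4096) -> bool:
    used = estimated_tokens(system_prompt) + estimated_tokens(first_message)
    print(f"~{used} tokens of {context_length}")
    return used < context_length
```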

RiaGruen commented 2 months ago

Is it possible to run two versions of LM Studio in parallel? Otherwise I can't reproduce it, because I need a reliable version for my job.

I just reverted to 0.2.27 and it works fine again. Pity, really, because I would love to use large(r) context.

Talking of which: Is there any place one can download older versions? I had to dig through my backups to find the installer for 0.2.27, the website offers only the latest.

The context length is set to 4096 and the input is under 2.5k tokens, so that shouldn't be the problem.

yagil commented 2 months ago

@RiaGruen it would be really helpful if you create a separate issue and provide the info I mentioned above.

Reverting to an older version won't help us improve the software for everyone.

RiaGruen commented 2 months ago

@yagil I know :( Unfortunately I have a job that feeds my cats and thus takes precedence.

But do tell me, please: is it possible to run two versions of LM Studio in parallel? If so, I can test as well as feed the cats.

lpy86786 commented 2 months ago

My 2x Tesla P40 GPUs are not detected by version 0.3.2, but they work fine in version 0.2.18. In version 0.3.2, "Settings" shows the following:

GPU (LM Runtime Dependent)

```
gguf runtime: CPU llama.cpp v1.1.7
{
  "result": {
    "code": "NoDevicesFound",
    "message": "No gpus found without acceleration backend compilation!"
  },
  "gpuInfo": []
}
```
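When the runtime reports `NoDevicesFound`, it is worth first ruling out a driver-level problem. A minimal sketch, assuming the NVIDIA driver and `nvidia-smi` are installed; if both P40s are listed here but LM Studio still reports no devices, the detection failure is in the runtime rather than the driver:

```python
import subprocess

# Minimal sketch: list the GPUs the NVIDIA driver can see.
# If the cards appear here but the LM Studio runtime reports
# NoDevicesFound, the problem is in runtime detection, not the driver.
result = subprocess.run(["nvidia-smi", "-L"],
                        capture_output=True, text=True)
print(result.stdout or result.stderr)
```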

renellis commented 1 week ago

Same here, v0.3.5:

```
gguf runtime: CPU llama.cpp (Linux) v1.2.0
{
  "result": {
    "code": "NoDevicesFound",
    "message": "No gpus found without acceleration backend compilation!"
  },
  "gpuInfo": []
}
```