pieces-app / support


Pieces for Desktop application crashes when used with Mistral 7B CPU #235

Open ksubramanyeshwara opened 4 months ago

ksubramanyeshwara commented 4 months ago

Software

Desktop Application, VS Code

Operating System / Platform

Linux

Your Pieces OS Version

9.0.5

Early Access Program

Kindly describe the bug and include as much detail as possible on what you were doing so we can reproduce the bug.

The Pieces for Desktop application crashes when used with the Mistral 7B CPU model.

It used to crash every time I used the Mistral 7B CPU model when I was on Windows. After installing Ubuntu 24.04, the Pieces for Desktop application crashed again once. I have 24 GB of RAM.

tsavo-at-pieces commented 4 months ago

Hey there! Sorry to hear this. We're actually releasing a large update here soon that @brian-pieces is working on to make things more stable across CPU models.

Quick question: are/were you able to try the GPU model, and does it reproduce the same results?

One last thing that might be worth a try (and would help us narrow it down) would be the MSFT Phi-2 model.

I'll leave @brian-pieces and @mark-at-pieces to share specifics of the next Local LLM Runtime update, which should mitigate this and also bring you the latest Phi-3, Llama 3, Gemma, and a Mistral update.

tsavo-at-pieces commented 4 months ago

Also, thank you so much for reporting this, @ksubramanyeshwara. We really appreciate you being an early adopter and helping us understand how this thing behaves in the wild 🙌

ksubramanyeshwara commented 4 months ago

@tsavo-at-pieces Thank you for the response. I didn't try the GPU model because I have an integrated GPU. Is an integrated GPU sufficient to run a GPU model? I've already started downloading the MSFT Phi-2 model and will report back if I see any crashes.

I would love to try the newer models once they become available for Linux, and I hope Live Context will also be made available for Linux users with the upcoming update. Thank you.

ksubramanyeshwara commented 4 months ago

[Screenshot from 2024-05-31 19-54-05]

The MSFT Phi-2 model doesn't even understand the question, and it didn't complete the generation for the question I asked.

ksubramanyeshwara commented 4 months ago

[Screenshot from 2024-05-31 14-02-14]

[Screenshot from 2024-05-31 21-52-56]

Here's another piece of information: I am running updated versions of both the OS and the application. But when the application crashed, it showed I was using Pieces OS version 9.0.3.

@tsavo-at-pieces

ksubramanyeshwara commented 4 months ago


[Screenshot from 2024-06-01 11-15-50]

As you can see, the Mistral 7B GPU model won't work on my machine.

brian-pieces commented 4 months ago

@ksubramanyeshwara sorry to hear you're having these problems!

Like Tsavo said, we have a great update coming that will fix a lot of these problems, but I will try to address your issues in the meantime.

If you're able to run Phi-2 on CPU but not Mistral 7B, then you're likely running out of RAM. Fitting a 7B model into RAM is pretty tough, and even if it fits, the output will be fairly slow.
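For a rough sense of why 7B is tight, here's a back-of-the-envelope sketch (not our actual memory accounting; real usage depends on quantization, KV cache, and runtime overhead):

```python
# Rough RAM estimate for LLM weights: parameters * bytes-per-parameter.
# Back-of-the-envelope only; the KV cache and runtime buffers add more on top.

def weights_gib(n_params_b: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GiB for a model with n_params_b billion parameters."""
    return n_params_b * 1e9 * bytes_per_param / (1024 ** 3)

for label, bpp in [("fp16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    print(f"Mistral 7B @ {label}: ~{weights_gib(7.0, bpp):.1f} GiB for weights alone")

# fp16:  ~13.0 GiB -- little headroom on a 24 GB machine next to the OS and IDE
# 8-bit: ~6.5 GiB
# 4-bit: ~3.3 GiB -- plus a KV cache that grows with context length
```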

You're correct: with our current runtime, you'll be unable to run our GPU models on an integrated GPU.
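As an aside, if you want to confirm whether your machine exposes a dedicated GPU, here is a minimal NVIDIA-only check (integrated Intel/AMD graphics won't appear in nvidia-smi):

```python
# Quick check for a dedicated NVIDIA GPU via nvidia-smi (NVIDIA-only).
import shutil
import subprocess

def nvidia_gpus() -> list[str]:
    """Return names of visible NVIDIA GPUs, or an empty list if none/no driver."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True,
    )
    if out.returncode != 0:
        return []
    return [line.strip() for line in out.stdout.splitlines() if line.strip()]

print(nvidia_gpus() or "No dedicated NVIDIA GPU detected")
```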

Phi-2 is likely behaving poorly because of too much context and a limited context window size for the model. We've had to limit the context window size for our CPU models with our current runtime in order to keep RAM usage reasonable, but this can result in decreased ability to answer questions about earlier context as well as outputs being cut off.
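To illustrate that trade-off (a hypothetical sketch, not our actual runtime code): with a capped window, the oldest context has to be dropped to keep the prompt plus the generation inside the budget, which is exactly why answers about earlier conversation degrade:

```python
# Hypothetical illustration of a capped context window: older messages are
# dropped so prompt + generation fit the budget. Not Pieces' actual code.

MAX_CONTEXT_TOKENS = 1024      # assumed cap for a CPU model
RESERVED_FOR_OUTPUT = 256      # tokens kept free for the model's answer

def rough_token_count(text: str) -> int:
    # Crude heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)

def fit_history(messages: list[str]) -> list[str]:
    """Keep the most recent messages that fit within the prompt budget."""
    budget = MAX_CONTEXT_TOKENS - RESERVED_FOR_OUTPUT
    kept: list[str] = []
    for msg in reversed(messages):    # walk newest-first
        cost = rough_token_count(msg)
        if budget - cost < 0:
            break                     # older messages are silently dropped
        budget -= cost
        kept.append(msg)
    return list(reversed(kept))
```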

Now, as for how our new update will help: unfortunately, without a dedicated GPU and with 24 GB of RAM, you'll still be limited to smaller LLMs. That being said, we're planning on releasing a couple of new small models that blow Phi-2 out of the water, so they should work great for you!

We're also going to increase the context sizes for our CPU models which should alleviate the problems mentioned with Phi-2 CPU. We'll still need to be careful to ensure we don't overload the CPU because of too much context.

Until we have these updates rolled out, if you're on an integrated GPU machine then I would recommend using our cloud LLMs to have the highest quality experience with Pieces.

Thank you for reporting the problems and being patient with us! Supporting on-device LLMs is important to us, so your feedback is very helpful. We're working hard on the new update and will hopefully have it out soon!

Please let me know if you have any other questions or need any clarifications!

ramarivera commented 4 months ago

Just to add one more case: I am using the Mistral GPU model and I experience the same issues in the desktop app :(

av commented 3 months ago

At least the Mistral and Phi-2 GPU models do not work on Linux when installed from snap (I didn't try the other ones).


I tried to debug it.

Since this project supports OpenAI LLMs, it'd be greatly appreciated if we could use an OpenAI-compatible API instead of the specific runtime bundled within the app. That would also help compensate for the slightly dated model selection and for non-reusable downloads clogging the disk.
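For reference, this is the generic pattern I mean (a sketch assuming a local Ollama server on its default port serving a model tagged "mistral"; not something the app supports today):

```python
# Sketch of the OpenAI-compatible pattern: the same client code talks to any
# server that implements the API, whether hosted or running locally.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # assumed local OpenAI-compatible server (Ollama default)
    api_key="unused-for-local",            # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="mistral",                       # whatever model the local server serves
    messages=[{"role": "user", "content": "Hello from a local runtime"}],
)
print(response.choices[0].message.content)
```

The point is that only base_url and the model name change between backends, so models already downloaded for other tools could be reused instead of duplicated.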