coleam00 / bolt.new-any-llm

Prompt, run, edit, and deploy full-stack web applications using any LLM you want!
https://bolt.new

High CPU and RAM Usage with Bolt.New #295

Open ahmedashraf443 opened 2 days ago

ahmedashraf443 commented 2 days ago

Describe the bug

Language models that fit within my GPU run smoothly through other tools (e.g., Aider or OpenWebUI), using only the GPU and delivering optimal performance. However, when I use the same models through Bolt.New, I hit significant performance issues:

As soon as I send anything through Bolt.New, the Ollama server spikes in both RAM and CPU usage. It behaves as if I were running a much larger model, not the same small model that runs fine elsewhere.
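For what it's worth, one pattern consistent with these symptoms (an assumption on my part, not a confirmed diagnosis) is the model no longer fitting in VRAM: if a client requests a much larger context window (num_ctx) than Aider or OpenWebUI do, Ollama allocates a correspondingly larger KV cache and may offload layers to the CPU. The sketch below sends the same prompt twice with different num_ctx values so you can compare memory usage between the two requests:

```ts
// Minimal sketch. Assumptions: a local Ollama server on its default port
// 11434, and any small model that fits your GPU ("phi3.5" here is just an
// example). Watch RAM/CPU in a system monitor while each request runs.
const OLLAMA = "http://localhost:11434";

async function generate(numCtx: number): Promise<void> {
  const res = await fetch(`${OLLAMA}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "phi3.5",
      prompt: "Say hello.",
      stream: false,
      options: { num_ctx: numCtx }, // context window; sizes the KV cache
    }),
  });
  const data = await res.json();
  console.log(`num_ctx=${numCtx}:`, data.response);
}

// A small window should stay on the GPU; a much larger one may exceed
// VRAM and spill to CPU/system RAM, reproducing the spike described above.
await generate(2048);
await generate(32768);
```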

Link to the Bolt URL that caused the error

I used pnpm run dev to test it out before deploying

Steps to reproduce

1. Load a language model that fits within the GPU using Bolt.New.
2. Observe CPU and RAM usage in the system monitor (a diagnostic sketch follows this list).
3. Note the token generation speed compared to other platforms.
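One way to confirm where the model actually landed is Ollama's GET /api/ps endpoint (the same data the ollama ps command prints), which reports each loaded model's total size and how much of it is resident in VRAM. A minimal sketch, assuming the default local endpoint:

```ts
// Sketch: check how much of each loaded model sits in VRAM vs. system RAM.
// Uses Ollama's GET /api/ps endpoint; assumes the default local port.
interface PsModel {
  name: string;
  size: number;      // total bytes the running model occupies
  size_vram: number; // bytes resident in GPU memory
}

const res = await fetch("http://localhost:11434/api/ps");
const { models } = (await res.json()) as { models: PsModel[] };

for (const m of models) {
  const pctGpu = m.size > 0 ? ((m.size_vram / m.size) * 100).toFixed(0) : "0";
  console.log(`${m.name}: ${pctGpu}% in VRAM (${m.size_vram} of ${m.size} bytes)`);
  // Anything well under 100% means layers were offloaded to the CPU,
  // matching the RAM/CPU spike described in this issue.
}
```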

Expected behavior

The model should run smoothly, leveraging the GPU for computation, similar to its performance in other environments like Aider or OpenWebUI.

Screen Recording / Screenshot

No response

Platform

Additional context

No response

zjjt commented 19 hours ago

I second this. Right now I am an hour into generating an app with the phi3.5 model on my M1 Mac with 8 GB of RAM. It is the only model that responded fast and could actually execute and preview code; the others only behaved like ChatGPT, offering solutions I could have implemented myself. But an hour in, and it still has not finished generating the package.json file. Is there a way to smooth this out?
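To put a number on the slowdown, Ollama's non-streaming generate response includes eval_count (tokens generated) and eval_duration (nanoseconds spent generating), so you can measure the bare model's tokens per second and compare it with what Bolt achieves. A minimal sketch, assuming the default local endpoint and the phi3.5 model mentioned above:

```ts
// Sketch: measure raw tokens/sec from Ollama itself, to separate model
// speed from any client-side overhead. Assumes the default local port.
const res = await fetch("http://localhost:11434/api/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "phi3.5",
    prompt: "Write a package.json for a minimal Vite + React app.",
    stream: false,
  }),
});
const data = await res.json();

// eval_count = generated tokens, eval_duration = generation time in ns.
const tokensPerSec = data.eval_count / (data.eval_duration / 1e9);
console.log(`${data.eval_count} tokens at ${tokensPerSec.toFixed(1)} tok/s`);
// If this bare call is fast but the same model crawls inside Bolt, the
// slowdown is coming from how the client configures the request, not
// from the model itself.
```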