coleam00 / bolt.new-any-llm

Prompt, run, edit, and deploy full-stack web applications using any LLM you want!
https://bolt.new
MIT License

(Using qwen2.5-coder-extra-large:7b (Ollama)) After Docker Build: Frontend Stuck on Loading #231

Open darshanbuildx opened 1 week ago

darshanbuildx commented 1 week ago

Describe the bug

After cloning the repository, I rebuilt the Docker container and opened the app in the Canary browser, and I ran into the following issues:

- The terminal/CMD shows repeated warnings and errors related to dependencies and optimization.
- The frontend on localhost:5173 is stuck on loading and does not generate any output, even after waiting for more than 10 minutes.
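
For anyone trying to reproduce this, the exact warnings are easier to share as text than as a screenshot. Assuming the repo's docker-compose setup, something like this should capture them (a sketch, not the project's documented workflow):

```sh
# List running containers to find the app container's name
docker ps

# Stream the container output so the repeated warnings can be copied as text
docker compose logs -f
```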

Link to the Bolt URL that caused the error

http://localhost:5173

Steps to reproduce

  1. Clone the repository.
  2. Build and run the Docker container using the provided Docker commands (see the sketch after this list).
  3. Open the application in the Canary browser (localhost:5173).
  4. Observe the terminal/CMD output and frontend behavior.
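
For reference, step 2 typically looks roughly like this; the build target and compose profile names below are assumptions based on the repo's Docker setup, so check the README for the exact commands:

```sh
# Build the development image (target name is an assumption from the repo's Dockerfile)
docker build . --target bolt-ai-development

# Start it with the development compose profile
docker compose --profile development up
```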

Expected behavior

The app should load properly, generate responses, and function as intended without prolonged loading or errors.

Screen Recording / Screenshot

[Screenshot attached]

Platform

OS: Windows 10
Browser: Chrome Canary
Version: 132.0.6828.0

Additional context

The issue persists despite several attempts, including rebuilding the Docker container and trying different commands.

omarnahdi commented 1 week ago

@darshanbuildx Did you find a solution?

darshanbuildx commented 1 week ago

@omarnahdi Nope, no luck.

omarnahdi commented 1 week ago

@darshanbuildx what are your specs?

darshanbuildx commented 1 week ago

@omarnahdi Windows 10, Docker, Chrome Canary. What other specs would you need to help?

Intiiii commented 1 week ago

Is it possible that your PC just isn't fast enough? By the way, by specs we mean your memory, processor, etc.

darshanbuildx commented 1 week ago

Thank you for responding, @omarnahdi and @Intiiii. Here are my specs:

32 GB RAM, 1 TB SSD, Intel(R) UHD Graphics 620 (16 GB shared memory), CPU: Intel(R) Core(TM) i5-8350U @ 1.70 GHz

Base speed: 1.90 GHz
Sockets: 1
Cores: 4
Logical processors: 8
Virtualization: Enabled
L1 cache: 256 KB
L2 cache: 1.0 MB
L3 cache: 6.0 MB

Utilization: 100%
Speed: 1.93 GHz
Up time: 2:22:10:48
Processes: 413
Threads: 6063
Handles: 203480

omarnahdi commented 1 week ago

@darshanbuildx You have pretty much the same setup as me, so I think you should try a smaller model. Does that qwen model run smoothly in the cmd terminal? If not, you should really switch to a smaller one. Mine runs too, but it is somewhat slow with bolt, even though it is lightning fast in the cmd and with open-webui. I don't know why it is slow with bolt, but I'd suggest trying https://console.mistral.ai/api-keys/ (they have low restrictions in the free tier as well, see https://console.mistral.ai/usage/), or just use any model from GitHub Models, since they are all OpenAI API compatible.
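
A quick way to check whether the model itself is the bottleneck, as suggested above, is to run it straight through the Ollama CLI with timing enabled; the `--verbose` flag prints prompt-eval and eval rates in tokens/s after each response:

```sh
# Run the model directly in the terminal; --verbose reports load time,
# prompt eval rate, and eval rate (tokens/s) after the response
ollama run qwen2.5-coder:7b --verbose "Write a hello world server in JavaScript"
```

If the eval rate is already in the low single digits here, bolt's much larger prompts will only make things worse.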

radumalica commented 1 week ago

Guys, don't post issues here if you don't know which models your hardware can handle. On integrated graphics, a 7B qwen "extra large" model just kills your machine.

I have a Ryzen 5 5600X with 64 GB of memory and an RTX 4070 with 12 GB VRAM, and the only model that is decent enough is qwen-coder-2.5-7b with q6 quantization. It runs on my setup at around 10 tokens/s. Not really a match for official bolt or how fast ChatGPT outputs text, but it is usable.

It's not all about VRAM; it's also about how many Tensor cores your GPU has. I tried 14b with q4 quantization and it is dead slow on my machine (5 tokens/s). And the CPU has no Tensor cores at all, which is why all these AI workloads (Stable Diffusion, LLMs, etc.) need to run on the GPU.
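
To make the VRAM point concrete: when a model (or its enlarged context) doesn't fit on the GPU, Ollama silently splits it between GPU and CPU, and tokens/s collapses. The split can be checked while a model is loaded; the quantization tag below is an example, so check the Ollama model library for the exact tags:

```sh
# Pull a specific quantization (example tag; see the Ollama model library)
ollama pull qwen2.5-coder:7b-instruct-q6_K

# While a prompt is running, show loaded models and their GPU/CPU split
ollama ps
```

A PROCESSOR column reading "100% GPU" is what you want; something like "48%/52% CPU/GPU" would explain a 5 tokens/s result.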

omarnahdi commented 1 week ago

> Guys, don't post issues here if you don't know which models your hardware can handle. On integrated graphics, a 7B qwen "extra large" model just kills your machine.
>
> I have a Ryzen 5 5600X with 64 GB of memory and an RTX 4070 with 12 GB VRAM, and the only model that is decent enough is qwen-coder-2.5-7b with q6 quantization. It runs on my setup at around 10 tokens/s. Not really a match for official bolt or how fast ChatGPT outputs text, but it is usable.
>
> It's not all about VRAM; it's also about how many Tensor cores your GPU has. I tried 14b with q4 quantization and it is dead slow on my machine (5 tokens/s). And the CPU has no Tensor cores at all, which is why all these AI workloads (Stable Diffusion, LLMs, etc.) need to run on the GPU.

@radumalica If that's the case, then how come deepseek-coder-v2 runs flawlessly on my system with open-webui through Ollama, but not with bolt? Can you elaborate on how and why? It runs on my i9-13900H CPU with 32 GB of DDR5 RAM.

AmariLuigi commented 4 days ago

@radumalica If that's the case, then Qwen 32B (qwen2.5-coder:32b-base-q3_K_S via Ollama) also runs flawlessly on my system with open-webui through Ollama, but not with bolt :)
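
One plausible explanation for this recurring "fast in open-webui, slow or stuck in bolt" pattern (a guess, not something confirmed in this thread): bolt sends a far larger system prompt than a plain chat UI, so the context window matters, and the "extra large" name in this issue's title comes from a Modelfile recipe that raises num_ctx. A bigger num_ctx means a much bigger KV cache, which can push a model that normally fits in VRAM partly onto the CPU. A minimal sketch of that recipe, with the base model tag and context size as illustrative assumptions:

```sh
# Recreate an "extra large" variant with a larger context window
# (base model tag and num_ctx value are illustrative)
cat > Modelfile <<'EOF'
FROM qwen2.5-coder:7b
PARAMETER num_ctx 32768
EOF
ollama create qwen2.5-coder-extra-large:7b -f Modelfile
```

If this variant no longer fits in VRAM, `ollama ps` (see above) will show a CPU/GPU split even though the base model ran fully on the GPU.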