oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

Crazy high GPU utilization in the *browser* when training #2114

Closed UrsaMaximus closed 1 year ago

UrsaMaximus commented 1 year ago

Describe the bug

This is an odd one but I think worth taking a look at.

I've noticed that when doing LoRA training in the web UI, my browser has incredibly high GPU utilization. This issue persists across my MacBook and Windows PC. It happens with the CPU, CUDA, and MPS backends. It happens in the latest Firefox and Chrome browsers, though it's notably worse in Firefox.

When idle, or even when generating text, the text generation web UI does not consume an unusual amount of resources: CPU, GPU, or memory. It just looks like a typical webpage. All major resource consumption is attributed to its Python backend. This is what I would expect to see!

However, during LoRA training, according to Task Manager (Windows) or Activity Monitor (macOS), my browser will chew up an astonishing 30-50% of my system's GPU resources. I suspect something in the status update code is unintentionally hammering the page and causing it to redo layout continuously, but I can't say for sure.

Fortunately, there's a fairly easy workaround: Opening a new blank tab and leaving the browser there cuts usage back down to ~1%, where it belongs. I can peek back at the Text Gen Web UI tab from time to time to look at training status, and it does seem to stay connected and display an updated progress bar.

Reproduction

Nothing special is needed for reproduction as far as I can tell. Attempt any LoRA training and check browser resource usage. I am attempting an Alpaca-like, JSON-formatted training run on the Llama-7b model, loaded in 8-bit, but I doubt it matters. The issue seems to be in the front end and independent of pretty much anything else.

Logs

No errors, just excessive browser resource usage

System Info

== macOS Machine ==
Model Name: MacBook Pro
Model Identifier: MacBookPro18,2
Chip: Apple M1 Max
Total Number of Cores (CPU): 10 (8 performance and 2 efficiency)
Total Number of Cores (GPU): 32
Memory: 64 GB
System Version: macOS 13.3.1 (22E261)
Firefox: 113.0.1 (64-bit)
Chrome: 113.0.5672.126

== Windows Machine ==
OS Name: Microsoft Windows 10 Pro
Version: 10.0.19045 Build 19045
Processor: Intel(R) Core(TM) i7-5930K CPU @ 3.50GHz, 3501 Mhz, 6 Core(s), 12 Logical Processor(s)
RAM: 16.0 GB
GPU: NVIDIA GeForce RTX 3080 Ti
VRAM: 12 GB
Firefox: 113.0.1 (64-bit)
Chrome: 113.0.5672.93 (Official Build) (64-bit)

Ph0rk0z commented 1 year ago

Hmm... seems like this is an issue with the browser updating the training params constantly. Mine didn't have this issue in LibreWolf, but it also disconnected on me 5 hours into training.

sbakhospaystone commented 1 year ago

Also happens with Firefox on Linux

sean-kang commented 1 year ago

I see the same problem during text generation. If the web UI tab is active in Firefox, the browser's GPU usage spikes to 90%-100%. It actually keeps text generation from working. I have to switch to an empty tab to make it work.

Running a lighter 7B 4-bit model on GPTQ seems okay, maybe because it runs fast on my PC. But the problem becomes more prominent if I run a heavier 13B 4-bit model on llama.cpp.

Firefox version 114.0.1 on Windows 10. The web UI runs on Ubuntu under WSL2.

Ph0rk0z commented 1 year ago

Sounds like gradio things, tbh.

vrmuza commented 1 year ago

It's the flashing/glowing orange bar thing at the bottom. If that's on screen, the training hangs, or the start of generation will be slow in chat mode too. I have to hide those bars when I use the textgen UI.

Edit: in chat mode it's the orange bar at the top

Can do:

- Scroll up
- Block the orange bar with the uBlock element hider
- Edit the CSS (index-4e8912f3.css) in the inspector's style editor, or edit the .css files themselves, to remove the animation (search for "keyframe" and remove everything contained in that set of brackets)

delete:

@keyframes svelte-j1gjts-pulse {
 0%,
 to {
  opacity:1
 }
 50% {
  opacity:.5
 }
}

in the Firefox inspector, or in the .css files under one of these directories:

installer_files\env\Lib\site-packages\gradio\templates\frontend\assets
installer_files\env\Lib\site-packages\gradio\templates\frontend\static

I don't know which of the two matters, so I just did both, then cleared the Firefox cache and reloaded the web UI.

- Setting the monitor refresh rate lower helps too, I think, but then you're stuck with 60 Hz, so that's a no-go
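
For anyone who would rather script the "search for keyframe and remove the block" step than do it by hand, here is a minimal Python sketch. It is an illustration, not something from the Gradio or text-generation-webui code: the function name `strip_keyframes` is made up here. It works on a CSS string by simple brace matching, so it assumes no braces appear inside comments or strings in the stylesheet.

```python
import re
from typing import Optional

# Matches the start of a block such as "@keyframes svelte-j1gjts-pulse {"
# (optionally with a vendor prefix like "@-webkit-keyframes").
_KEYFRAMES_START = re.compile(r"@(?:-\w+-)?keyframes\s+([\w-]+)\s*\{")


def strip_keyframes(css: str, name: Optional[str] = None) -> str:
    """Return `css` without its @keyframes blocks.

    If `name` is given, only the block with that animation name is removed.
    Hypothetical helper for illustration; uses plain brace counting.
    """
    kept = []
    pos = 0
    while True:
        match = _KEYFRAMES_START.search(css, pos)
        if match is None:
            kept.append(css[pos:])
            break
        # Walk forward until the brace opened by the @keyframes rule closes.
        depth = 1
        end = match.end()
        while end < len(css) and depth > 0:
            if css[end] == "{":
                depth += 1
            elif css[end] == "}":
                depth -= 1
            end += 1
        if name is None or match.group(1) == name:
            kept.append(css[pos:match.start()])  # drop the whole block
        else:
            kept.append(css[pos:end])  # keep blocks with other names
        pos = end
    return "".join(kept)
```

To use it, read the contents of the .css file into a string, pass it through `strip_keyframes`, and write the result back (keeping a backup of the original file). A Gradio update will presumably replace the edited file, so the change would need to be reapplied.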

github-actions[bot] commented 1 year ago

This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.

cg6 commented 1 year ago

Comment out the animation line in main.css. I'm using a 144Hz monitor - might be relevant to GPU use.

.typing span {
  content: '';
  /* animation: blink 1.5s infinite; */
  animation-fill-mode: both;
  height: 10px;
  width: 10px;
  background: #3b5998;
  position: absolute;
  left:0;
  top:0;
  border-radius: 50%;
}
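
The same edit can be sketched in Python for anyone who wants to reapply it after an update. This is only an illustration under stated assumptions: the helper name `comment_out_animation` is hypothetical, it operates on a CSS string rather than on any particular file, and it assumes the rule body contains no nested braces. It is not idempotent, so run it once on an unmodified stylesheet.

```python
import re


def comment_out_animation(css: str, selector: str) -> str:
    """Wrap `animation:` declarations in the rule for `selector` in CSS comments.

    Hypothetical helper for illustration. Longhand properties such as
    `animation-fill-mode` are left untouched.
    """
    # Capture "<selector> {", the rule body, and the closing brace separately.
    rule = re.compile("(" + re.escape(selector) + r"\s*\{)([^{}]*)(\})")
    # Only the shorthand `animation:` property, not `animation-*` longhands.
    declaration = re.compile(r"(?<![\w-])animation\s*:[^;{}]*;")

    def patch(match):
        body = declaration.sub(lambda d: "/* " + d.group(0) + " */", match.group(2))
        return match.group(1) + body + match.group(3)

    return rule.sub(patch, css)
```

Calling `comment_out_animation(css_text, ".typing span")` on the original rule produces the commented-out `animation` line shown above and leaves the other declarations as they were.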