Add model.half() when loading the model

intel-analytics / text-generation-webui

A Gradio Web UI for running local LLMs on Intel GPUs (e.g., a local PC with an iGPU, or a discrete GPU such as Arc, Flex, or Max) using IPEX-LLM.

GNU Affero General Public License v3.0

14 stars 8 forks

Closed by hkvision 5 months ago

hkvision commented 5 months ago

@sgwhat To save some memory?

sgwhat commented 5 months ago

sure.
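
For context on the memory saving discussed above, here is a minimal sketch of what calling `half()` does to a PyTorch module. It is not the code from this PR: the small `nn.Sequential` model and the `param_bytes` helper are hypothetical stand-ins, and a real model would come from the web UI's loader. The sketch assumes the parameters start in float32; the saving on a model that is already quantized may be smaller.

```python
import torch
import torch.nn as nn


def param_bytes(model: nn.Module) -> int:
    """Total bytes occupied by the model's parameters (hypothetical helper)."""
    return sum(p.numel() * p.element_size() for p in model.parameters())


# Hypothetical stand-in for a loaded LLM; parameters default to torch.float32.
model = nn.Sequential(nn.Linear(256, 256), nn.Linear(256, 64))

fp32_bytes = param_bytes(model)

# half() casts all floating-point parameters and buffers to torch.float16.
model = model.half()
fp16_bytes = param_bytes(model)

print(f"fp32: {fp32_bytes} bytes, fp16: {fp16_bytes} bytes")
```

Each float16 value takes 2 bytes instead of 4, so parameters that were float32 take half the memory after the cast.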