Closed serfan closed 1 year ago
+1 Same error on a 3080 with 10 GB.
I found a fix on the official ChatGLM2 GitHub repo; anyone hitting the same problem can give it a try. Edit wenda\llms\llm_glm6b.py and find:

```python
if "chatglm2" in settings.llm.path:
    model = AutoModel.from_pretrained(settings.llm.path, local_files_only=True, trust_remote_code=True, device=device, revision="v1.1.0")
else:
    model = AutoModel.from_pretrained(settings.llm.path, local_files_only=True, trust_remote_code=True, revision="v1.1.0")
```

Change it to:

```python
if "chatglm2" in settings.llm.path:
    model = AutoModel.from_pretrained(settings.llm.path, local_files_only=True, trust_remote_code=True, revision="v1.1.0").cuda()
else:
    model = AutoModel.from_pretrained(settings.llm.path, local_files_only=True, trust_remote_code=True, revision="v1.1.0")
```

Save the file and restart chatglm; the model then loads normally.
Reference: https://github.com/THUDM/ChatGLM2-6B/issues/52#issuecomment-1608625913
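For what it's worth, the traceback below shows why the two calls behave differently: with the `device=device` keyword the custom ChatGLM code routes through `torch.nn.utils.skip_init` and `to_empty(device=...)`, allocating buffers directly on the GPU, while `.cuda()` builds the already-quantized module on the CPU and only then moves it. A minimal sketch of that second pattern, with a plain `nn.Linear` as a hypothetical stand-in for the model (assumes PyTorch is installed; stays on CPU when no GPU is present):

```python
import torch
import torch.nn as nn

# Stand-in for AutoModel.from_pretrained(...): parameters are
# materialized on the CPU first, like the patched loading path.
model = nn.Linear(16, 16)
assert next(model.parameters()).device.type == "cpu"

# Then move the finished module to the GPU, mirroring `.cuda()` in the fix.
if torch.cuda.is_available():
    model = model.cuda()
    print(next(model.parameters()).device)
else:
    print("No CUDA device available; module stays on CPU")
```

This is only a sketch of the loading pattern, not of the quantization itself.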
Could you submit a PR for this?
PR submitted.
The old chatglm-6b-int4 worked fine, but after switching to chatglm2-6b-int4 I keep getting CUDA out of memory, even after updating to the latest wenda. The error is below. My laptop's 3060 has only 6 GB of VRAM; is there really no way for me to try chatglm2? :(

```
Exception in thread Thread-1 (load_model):
Traceback (most recent call last):
  File "E:\ai\wenda\WPy64-31110\python-3.11.1.amd64\Lib\threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "E:\ai\wenda\WPy64-31110\python-3.11.1.amd64\Lib\threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)
  File "E:\ai\wenda\wenda\wenda.py", line 51, in load_model
    LLM.load_model()
  File "E:\ai\wenda\wenda\llms\llm_glm6b.py", line 71, in load_model
    model = AutoModel.from_pretrained(settings.llm.path, local_files_only=True, trust_remote_code=True, device=device, revision="v1.1.0")
  File "E:\ai\wenda\WPy64-31110\python-3.11.1.amd64\Lib\site-packages\transformers\models\auto\auto_factory.py", line 479, in from_pretrained
    return model_class.from_pretrained(
  File "E:\ai\wenda\WPy64-31110\python-3.11.1.amd64\Lib\site-packages\transformers\modeling_utils.py", line 2675, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "C:\Users\serfa/.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\modeling_chatglm.py", line 767, in __init__
    self.transformer = ChatGLMModel(config, empty_init=empty_init, device=device)
  File "C:\Users\serfa/.cache\huggingface\modules\transformers_modules\chatglm2-6b-int4\modeling_chatglm.py", line 700, in __init__
    self.encoder = init_method(GLMTransformer, config, **init_kwargs)
  File "E:\ai\wenda\WPy64-31110\python-3.11.1.amd64\Lib\site-packages\torch\nn\utils\init.py", line 52, in skip_init
    return module_cls(*args, **kwargs).to_empty(device=final_device)
  File "E:\ai\wenda\WPy64-31110\python-3.11.1.amd64\Lib\site-packages\torch\nn\modules\module.py", line 1024, in to_empty
    return self._apply(lambda t: torch.empty_like(t, device=device))
  File "E:\ai\wenda\WPy64-31110\python-3.11.1.amd64\Lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  File "E:\ai\wenda\WPy64-31110\python-3.11.1.amd64\Lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  File "E:\ai\wenda\WPy64-31110\python-3.11.1.amd64\Lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "E:\ai\wenda\WPy64-31110\python-3.11.1.amd64\Lib\site-packages\torch\nn\modules\module.py", line 820, in _apply
    param_applied = fn(param)
  File "E:\ai\wenda\WPy64-31110\python-3.11.1.amd64\Lib\site-packages\torch\nn\modules\module.py", line 1024, in <lambda>
    return self._apply(lambda t: torch.empty_like(t, device=device))
  File "E:\ai\wenda\WPy64-31110\python-3.11.1.amd64\Lib\site-packages\torch\_refs\__init__.py", line 4254, in empty_like
    return torch.empty_strided(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 108.00 MiB (GPU 0; 6.00 GiB total capacity; 5.34 GiB already allocated; 0 bytes free; 5.34 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
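Separately from any code patch, the error message itself suggests a mitigation for the case where reserved memory far exceeds allocated memory: capping the allocator's split size to reduce fragmentation. A hedged sketch of setting it before launching wenda (on Windows use `set` instead of `export`; the value 128 is only an illustrative choice):

```shell
# Limit how large the CUDA caching allocator's split blocks can be,
# as suggested by the OOM message, then start wenda as usual.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
```

In this particular trace, though, 5.34 GiB of 6 GiB was already allocated with 0 bytes free, so fragmentation tuning alone is unlikely to be enough without the loading fix above.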