2024-06-19 08:57:48 - accelerate.utils.modeling.modeling.py - INFO - 1008 - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 8/8 [00:03<00:00, 2.28it/s]
2024-06-19 08:57:52 - root.big_modeling.py - WARNING - 435 - Some parameters are on the meta device device because they were offloaded to the cpu.
2024-06-19 08:57:52 - accelerate.big_modeling.big_modeling.py - WARNING - 452 - You shouldn't move a model that is dispatched using accelerate hooks.
Traceback (most recent call last):
  File "F:\Soft\TestHHH.py", line 13, in <module>
    model = LocalLLMModel(
  File "F:\CondaSpace\envs\tymodel\lib\site-packages\pylmkit\llms\_huggingface_llm.py", line 40, in __init__
    self.model = self.model.half().cuda()
  File "F:\CondaSpace\envs\tymodel\lib\site-packages\accelerate\big_modeling.py", line 455, in wrapper
    raise RuntimeError("You can't move a model that has some modules offloaded to cpu or disk.")
RuntimeError: You can't move a model that has some modules offloaded to cpu or disk.
Test environment: Windows 10, RTX 3080, CUDA 12.1, cuDNN 8.9.6, Python 3.10, torch 2.3.1+cu121, TensorFlow 2.16.1
Test code
Error message
The two .py files mentioned in the traceback look like they relate to whether CUDA can be used, but CUDA is in fact available here.
Expected result: how should this error be fixed so the code runs normally? (The same code runs fine on Linux.)
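The logs point at the actual cause: the checkpoint did not fully fit in the GPU's memory budget at load time, so accelerate offloaded some modules to CPU ("Some parameters are on the meta device ... offloaded to the cpu"), and a model dispatched that way must not be moved with `.half().cuda()` (line 40 of pylmkit's `_huggingface_llm.py`). A minimal workaround sketch, assuming the underlying model is a transformers model (transformers records the per-module placement in the `hf_device_map` attribute when accelerate dispatches it): only move the model when nothing was offloaded.

```python
def safe_to_cuda(model):
    """Move a model to the GPU only if accelerate has NOT offloaded any of it.

    When a model is loaded with device_map="auto" and does not fit in the
    allowed VRAM, accelerate offloads some modules to cpu/disk and records the
    placement in model.hf_device_map; calling .cuda() on such a model raises
    the RuntimeError seen in the traceback above.
    """
    device_map = getattr(model, "hf_device_map", None)
    if device_map and any(str(d) in ("cpu", "disk") for d in device_map.values()):
        # Already dispatched with offloading: leave placement to accelerate.
        return model
    # Safe to move: no module lives on cpu/disk.
    return model.half().cuda()
```

Alternatives worth trying (hedged, since it depends on whether pylmkit forwards loading kwargs to `from_pretrained`): pass `torch_dtype=torch.float16` together with `device_map={"": 0}` so the whole model is placed on GPU 0 with no offloading, or raise `max_memory` as the first log line suggests. If the Linux machine has more VRAM available at load time, that would also explain why the same code runs there without offloading.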