Facico / Chinese-Vicuna

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model —— 一个中文低资源的llama+lora方案,结构参考alpaca
https://github.com/Facico/Chinese-Vicuna
Apache License 2.0

finetune_deepspeed: after using load_in_8bit, RuntimeError(f"expected there to be only one unique element in {items}") #205

Open ryan-gz opened 1 year ago

ryan-gz commented 1 year ago

Hello, and thank you for providing such a great project; I have learned a lot of LoRA techniques from it. Today, while debugging finetune_deepspeed, loading llama13B ran out of memory because my GPU memory is limited (2x3090). So I modified part of the code in finetune_deepspeed.py, specifically:

# Imports this snippet relies on (already present in finetune_deepspeed.py):
# import torch
# from transformers import LlamaForCausalLM
# from peft import prepare_model_for_int8_training

if args.use8bit:
    model = LlamaForCausalLM.from_pretrained(
        args.model_path,
        load_in_8bit=True,
        device_map=device_map,
    )
    # Prepares the quantized model for training (e.g. casts norm layers to fp32)
    model = prepare_model_for_int8_training(model)
else:
    model = LlamaForCausalLM.from_pretrained(
        args.model_path,
        load_in_8bit=False,
        torch_dtype=torch.float16,
        device_map=device_map,
    ).half()

I also changed the "enabled" field under "fp16" in zero_config.json to "auto". The quantized model did load successfully, but the following error occurred during training:

  File "/home/dlwork02/.conda/envs/vicuna/lib/python3.9/site-packages/deepspeed/runtime/utils.py", line 870, in get_only_unique_item
    raise RuntimeError(f"expected there to be only one unique element in {items}")
RuntimeError: expected there to be only one unique element in <generator object Init._convert_to_deepspeed_param.<locals>.all_gather_coalesced.<locals>.<genexpr> at 0x7f6f2dbc27b0>
Facico commented 1 year ago

This script is meant specifically for fp16; for 8-bit, use the finetune script instead. That version of DeepSpeed does not support 8-bit.
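For context on why the error fires: the traceback points at DeepSpeed's get_only_unique_item helper, which ZeRO's coalesced all-gather uses to assert that all parameters in a bucket share a single property (such as dtype). Mixing int8-quantized weights with fp16 parameters breaks that invariant. The following is a minimal, simplified re-implementation of the check for illustration only, not the actual DeepSpeed code:

```python
def get_only_unique_item(items):
    # Collapse the iterable to its unique values; exactly one is expected.
    item_set = set(items)
    if len(item_set) != 1:
        raise RuntimeError(f"expected there to be only one unique element in {item_set}")
    return next(iter(item_set))

# A pure-fp16 parameter group passes the check:
print(get_only_unique_item(["float16", "float16", "float16"]))  # float16

# A group mixing int8 (quantized) and fp16 parameters trips it,
# which is what happens when load_in_8bit=True meets ZeRO sharding:
try:
    get_only_unique_item(["int8", "float16"])
except RuntimeError as e:
    print("RuntimeError:", e)
```

This is why the maintainer's advice works: the non-DeepSpeed finetune script keeps the 8-bit model out of ZeRO's all-gather path entirely.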