yangjianxin1 / Firefly

Firefly: 大模型训练工具,支持训练Qwen2.5、Qwen2、Yi1.5、Phi-3、Llama3、Gemma、MiniCPM、Yi、Deepseek、Orion、Xverse、Mixtral-8x7B、Zephyr、Mistral、Baichuan2、Llma2、Llama、Qwen、Baichuan、ChatGLM2、InternLM、Ziya2、Vicuna、Bloom等大模型
5.92k stars 530 forks source link

为什么要在配置完又重新把模型dtype设置为fp32 #204

Open gongye19 opened 10 months ago

gongye19 commented 10 months ago
    model = AutoModelForCausalLM.from_pretrained(
    args.model_name_or_path,
    device_map=device_map,
    load_in_4bit=True,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
        llm_int8_threshold=6.0,
        llm_int8_has_fp16_weight=False,
    ),
)

......

model = get_peft_model(model, config)
model.print_trainable_parameters()
model.config.torch_dtype = torch.float32
yangjianxin1 commented 10 months ago

可忽略该操作,对训练不会产生实质影响