QwenLM / Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Apache License 2.0

After fine-tuning a model with LoRA in bfloat16, inference with the base model plus the LoRA adapter gives different results before and after calling merge_and_unload() — why does this happen? #1168

Closed — shaojh1 closed this 4 months ago

shaojh1 commented 6 months ago

Is there an existing issue / discussion for this?

Is there an existing answer for this in the FAQ?

Current Behavior

```python
model = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True,
    torch_dtype=torch.bfloat16, device_map="auto")

model = PeftModel.from_pretrained(
    model, adapter_path,
    torch_dtype=torch.bfloat16, device_map="auto")

model = model.merge_and_unload()
model.bfloat16()
```

Expected Behavior

In principle, inference results should be identical before and after calling the merge_and_unload() method.

Steps To Reproduce

No response

Environment

- OS: Ubuntu 18.04.6
- Python: 3.10
- Transformers: 4.32.0
- PyTorch: 2.0.1
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`): 11.4

Anything else?

No response

jklj077 commented 6 months ago

For accurate comparison of results between the two models, please adhere to these guidelines:

  1. Ensure that both models have the do_sample hyperparameter set to False in generation. This will guarantee that the models generate outputs deterministically rather than randomly sampling possible sequences.

  2. Be aware that minor discrepancies may exist in the output results because of inherent variations in floating-point arithmetic operations. Since the computational process diverges before and after adapter merging, this can lead to subtle differences in the final outcomes despite identical inputs.

If you have encountered substantial differences, steps and inference examples to reproduce the problem are welcome.
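Point 2 above can be illustrated without any model at all: floating-point addition is not associative, so evaluating the same sum in a different order (as happens when the adapter path is replaced by a merged weight matrix) changes the low-order bits. A pure-Python sketch:

```python
# Floating-point addition is not associative: the same three numbers summed
# in two different orders give results that differ by about one ulp.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c    # one evaluation order
right = a + (b + c)   # mathematically the same sum, different order

print(left == right)      # False
print(abs(left - right))  # on the order of 1e-16
```

In bfloat16, with only 8 bits of mantissa, such reordering effects are far larger than in float64, which is why minor output drift after merging is expected.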

tyh4521 commented 5 months ago

I have run into the same problem.

After fine-tuning, inference with the adapter behaves as expected, but the model produced from the adapter via merge_and_unload loses the fine-tuning effect at inference.

Both modes were run with do_sample=False and num_beams=1.
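The symptom described here is worth separating from rounding noise. The merge that merge_and_unload() performs is, mathematically, W_merged = W + (alpha / r) · B A, which reproduces the adapter path exactly up to floating-point error. A NumPy sketch of that arithmetic (illustrative shapes and a random seed, not taken from the thread):

```python
# Illustrate the LoRA merge arithmetic: folding the low-rank update into the
# base weight gives the same output as the base + adapter path.
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 16          # toy hidden size, LoRA rank, scaling alpha

W = rng.standard_normal((d, d))  # base weight
A = rng.standard_normal((r, d))  # LoRA down-projection
B = rng.standard_normal((d, r))  # LoRA up-projection
x = rng.standard_normal(d)       # an input vector

y_adapter = W @ x + (alpha / r) * (B @ (A @ x))  # base + adapter path
W_merged = W + (alpha / r) * (B @ A)             # weight after merging
y_merged = W_merged @ x

print(np.allclose(y_adapter, y_merged))  # True
```

Since the merge is exact up to rounding, a merged model that loses the fine-tuned behavior entirely points at a bug in the merge, save, or reload step (e.g. a dtype or config mismatch) rather than at floating-point drift.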

github-actions[bot] commented 4 months ago

This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.

ambyerhan commented 1 month ago

Same problem here, and the results also differ between device_map="auto" and device_map="cpu"~