lihua8848 opened this issue 2 weeks ago
Thanks for the feedback, we are assessing the impact.
Hi, this is indeed a mistake, thanks for reporting it. To keep training and inference consistent, we will not modify the code on Hugging Face directly; we will fix this issue systematically in a future model release.
> Hi, this is indeed a mistake, thanks for reporting it. To keep training and inference consistent, we will not modify the code on Hugging Face directly; we will fix this issue systematically in a future model release.

@YuzaChongyi Can you fully assess the impact? We are already fine-tuning the model and applying it to production. Or, when will the next model be released?
There is no problem as long as the behavior of patch_attn_mask is consistent between training and inference. We also tried modifying it directly, which barely changes the inference results. This version will not be updated, so that the evaluation results remain reproducible.
The release date of the next model is not certain yet; we are working on it.
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
当前行为 | Current Behavior
The patch_attn_mask computation is wrong: the indexing is off, so patch_attn_mask ends up all True.
In the figure above, when i=4 there are 17 padding patches, so the last 17 entries should be False, but patch_attn_mask ends up all True.
https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5/blob/main/modeling_minicpmv.py#L97
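The behavior described above is consistent with slicing the wrong axis of a 3-D boolean mask. A minimal sketch (the shapes, `tgt_sizes` values, and variable names here are hypothetical, chosen only to reproduce the reported "all True" symptom with NumPy's indexing semantics, which match PyTorch's for this case):

```python
import numpy as np

# Hypothetical setup: B images, mask of shape (B, 1, max_patches),
# where image i=4 has 47 valid patches and 17 trailing padding positions.
B, max_patches = 5, 64
tgt_sizes = [(8, 8), (8, 8), (8, 8), (8, 8), (47, 1)]

patch_attn_mask = np.zeros((B, 1, max_patches), dtype=bool)
i = 4
n_valid = tgt_sizes[i][0] * tgt_sizes[i][1]  # 47

# Suspected buggy indexing: mask[i, :n_valid] slices axis 1 (which has
# size 1), so the whole row along the patch axis is assigned True and
# no padding position is ever masked out.
patch_attn_mask[i, :n_valid] = True

print(patch_attn_mask[i, 0].all())  # -> True: padding was not masked
```

Because `:n_valid` applies to the singleton axis rather than the patch axis, the assignment broadcasts over all `max_patches` positions, matching the observed all-True mask.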
期望行为 | Expected Behavior
Change it to:
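The proposed fix in the report was attached as an image. A sketch of the likely correction, under the same hypothetical shapes as above: index the singleton axis explicitly so the slice lands on the patch axis.

```python
import numpy as np

# Same hypothetical setup: image i=4 has 47 valid patches, 17 padding.
B, max_patches = 5, 64
tgt_sizes = [(8, 8), (8, 8), (8, 8), (8, 8), (47, 1)]

patch_attn_mask = np.zeros((B, 1, max_patches), dtype=bool)
i = 4
n_valid = tgt_sizes[i][0] * tgt_sizes[i][1]  # 47

# Fixed indexing: select axis 1 with an explicit 0 so that the slice
# :n_valid applies to the patch axis, leaving padding positions False.
patch_attn_mask[i, 0, :n_valid] = True

print(patch_attn_mask[i, 0, n_valid:].any())  # -> False: 17 padding positions stay masked
```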
复现方法 | Steps To Reproduce
No response
运行环境 | Environment
备注 | Anything else?
No response