trestad / mitigating-reversal-curse

Code for paper 'Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse'

loss is 0 and eval metrics is 0 #1

Open lzc-nazarite opened 5 months ago

lzc-nazarite commented 5 months ago

Hello, I tried to reproduce your results with mlm_run.py and ran into two problems:
1. The loss drops to 0 very early in training (already in epoch 1).
2. During evaluation, all metrics of the trained MLM model are 0.


Can you help me understand what is causing this?

trestad commented 5 months ago

Apologies for the delayed reply. Which specific model are you training, and what dtype are you using? I haven't come across this particular issue before. You can also contact me by email at anglv@ruc.edu.cn, which may be more convenient for discussion.