kevinyaobytedance / llm_unlearn

LLM Unlearning
MIT License

About the badloss nan and the release of eval code. #2

Open yongliang-wu opened 8 months ago

yongliang-wu commented 8 months ago

Hello,

Thank you for sharing the code. I attempted to run the unlearning code on the opt-1.3b model, but after the first batch the loss became `nan`. Given my hardware constraints (a single RTX 3090), I set the batch size to 1 and the precision to fp16. Here is a snapshot of the log:

```
batch: 0, bad_loss: 1.68, current_div_loss: inf
batch: 1 and onwards, bad_loss: nan, current_div_loss: nan
```
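For context on why this pattern appears: an `inf` in one loss term at batch 0 followed by `nan` everywhere is characteristic of fp16 overflow. fp16's largest finite value is 65504, so a KL-style divergence term can easily overflow to `inf`, and any arithmetic mixing `inf` with `inf` (as happens in the backward pass) produces `nan`, which then poisons the weights for all later batches. This is a minimal, self-contained sketch of that mechanism (it models fp16 overflow with a plain-Python helper; `to_fp16` and the `exp(12)` value are illustrative assumptions, not code from this repo):

```python
import math

FP16_MAX = 65504.0  # largest finite float16 value

def to_fp16(x: float) -> float:
    """Crude model of float16 rounding: magnitudes beyond the
    representable range overflow to signed infinity."""
    if abs(x) > FP16_MAX:
        return math.copysign(float("inf"), x)
    return x

# A divergence term of exp(12) ~ 1.6e5 already exceeds the fp16 range,
# so in half precision it becomes inf -- matching "current_div_loss: inf".
div_loss = to_fp16(math.exp(12.0))

# Once inf enters the gradient computation, expressions like inf - inf
# evaluate to nan, which then propagates into the weights and explains
# "bad_loss: nan" from batch 1 onwards.
grad = div_loss - div_loss
print(div_loss)          # inf
print(math.isnan(grad))  # True
```

Common mitigations in this situation are switching to bf16 (same exponent range as fp32), computing the divergence loss in fp32 even when the model runs in fp16, or using `torch.cuda.amp.GradScaler`, which skips optimizer steps whose gradients contain `inf`/`nan`.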

Additionally, I'd like to inquire if there are plans to release the evaluation code.

Thank you for your assistance!


CurryxIaoHu commented 1 month ago

Did you solve this problem?

yongliang-wu commented 3 weeks ago

Not yet. I give up.