kevinyaobytedance / llm_unlearn

LLM Unlearning
MIT License

About the badloss nan and the release of eval code. #2

Open yongliang-wu opened 8 months ago

yongliang-wu commented 8 months ago

Hello,

Thank you for sharing the code. I attempted to run the unlearning code on the opt-1.3b model, but after the first batch the loss became `nan`. Given my hardware constraints (a single RTX 3090), I set the batch size to 1 and the precision to fp16. Here is a snapshot of the log:

```
batch: 0, bad_loss: 1.68, current_div_loss: inf
batch: 1 and onwards, bad_loss: nan, current_div_loss: nan
```
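For context on why this pattern appears: an `inf` in one loss term at batch 0 followed by `nan` everywhere is characteristic of fp16 overflow. fp16's largest finite value is 65504, so a KL-style divergence term can easily overflow to `inf`, and any arithmetic mixing `inf` with `inf` (as happens in the backward pass) produces `nan`, which then poisons the weights for all later batches. This is a minimal, self-contained sketch of that mechanism (it models fp16 overflow with a plain-Python helper; `to_fp16` and the `exp(12)` value are illustrative assumptions, not code from this repo):

```python
import math

FP16_MAX = 65504.0  # largest finite float16 value

def to_fp16(x: float) -> float:
    """Crude model of float16 rounding: magnitudes beyond the
    representable range overflow to signed infinity."""
    if abs(x) > FP16_MAX:
        return math.copysign(float("inf"), x)
    return x

# A divergence term of exp(12) ~ 1.6e5 already exceeds the fp16 range,
# so in half precision it becomes inf -- matching "current_div_loss: inf".
div_loss = to_fp16(math.exp(12.0))

# Once inf enters the gradient computation, expressions like inf - inf
# evaluate to nan, which then propagates into the weights and explains
# "bad_loss: nan" from batch 1 onwards.
grad = div_loss - div_loss
print(div_loss)          # inf
print(math.isnan(grad))  # True
```

Common mitigations in this situation are switching to bf16 (same exponent range as fp32), computing the divergence loss in fp32 even when the model runs in fp16, or using `torch.cuda.amp.GradScaler`, which skips optimizer steps whose gradients contain `inf`/`nan`.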

Additionally, I'd like to inquire if there are plans to release the evaluation code.

Thank you for your assistance!


CurryxIaoHu commented 1 month ago

Did you solve this problem?

yongliang-wu commented 3 weeks ago

Not yet. I give up.