Open Eurek001 opened 1 month ago
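Hi @Eurek001, the loss and labels you describe are abnormal. In our white-box setup, we set the labels to -100 for all content that does not come from white_box/affirmative_responses.txt, so only the affirmative-response tokens contribute to the loss. Please check whether the labels for the affirmative_responses tokens are set correctly; if every label is -100, no token is supervised, which would explain loss=nan.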
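To illustrate why all-masked labels and a NaN loss tend to go together, here is a minimal pure-Python sketch. It is not the repository's code: `count_supervised`, `masked_mean_nll`, and the toy inputs are hypothetical names made up for this example, and it assumes the loss is averaged over non-masked positions only (which is how PyTorch's cross-entropy with `ignore_index=-100` and mean reduction behaves).

```python
import math

IGNORE_INDEX = -100  # label value used for masked (unsupervised) positions


def count_supervised(labels):
    """Number of positions that actually contribute to the loss."""
    return sum(1 for y in labels if y != IGNORE_INDEX)


def masked_mean_nll(log_probs, labels):
    """Mean negative log-likelihood over non-masked positions.

    log_probs[i][v] is the log-probability of token v at position i.
    """
    total, n = 0.0, 0
    for row, y in zip(log_probs, labels):
        if y == IGNORE_INDEX:
            continue  # masked position: skipped entirely
        total -= row[y]
        n += 1
    # With zero supervised tokens the mean is 0/0, reported here as nan.
    return total / n if n else float("nan")


# Toy example: 3 positions, vocabulary of 2 tokens, uniform predictions.
log_probs = [[math.log(0.5), math.log(0.5)] for _ in range(3)]
all_masked = [IGNORE_INDEX] * 3        # the situation reported in this issue
partly_masked = [IGNORE_INDEX, 0, 1]   # the expected situation

print(count_supervised(all_masked), masked_mean_nll(log_probs, all_masked))
print(count_supervised(partly_masked), masked_mean_nll(log_probs, partly_masked))
```

A similar count of labels not equal to -100 on a real batch should be greater than zero; if it is zero, the affirmative-response tokens are probably not reaching the labels.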
Hi, I'm excited about your work. I'm trying white_box_attack and I get loss=nan, with all labels equal to -100. Is this normal? Looking forward to your reply.