AoiDragon / HADES

[ECCV'24 Oral] The official GitHub page for "Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimodal Large Language Models"
MIT License

LOSS: NAN #9

Open Eurek001 opened 1 month ago

Eurek001 commented 1 month ago

Hi, I'm excited about your work. I'm trying white_box_attack and I'm seeing loss=nan, with the labels all set to -100. Is this normal? Looking forward to your reply.

pyogher commented 4 days ago

Hi @Eurek001, the loss and labels you describe are abnormal. In our white-box setup, we set the labels for all content outside of white_box/affirmative_responses.txt to -100. You should check whether the labels corresponding to the tokens of the affirmative responses are normal.
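
For anyone hitting the same symptom, here is a minimal sketch of why all-ignored labels can produce a NaN loss. It assumes the loss is a cross-entropy averaged over non-ignored tokens, with -100 as the ignore index (the default `ignore_index` in PyTorch's cross-entropy loss). The helper names and example values are hypothetical and are not taken from the HADES code.

```python
import math

IGNORE_INDEX = -100  # assumed ignore index (PyTorch's cross-entropy default)


def count_supervised(labels, ignore_index=IGNORE_INDEX):
    """Return how many label positions actually contribute to the loss."""
    return sum(1 for label in labels if label != ignore_index)


def masked_mean_loss(per_token_losses, labels, ignore_index=IGNORE_INDEX):
    """Average per-token losses over the non-ignored positions only.

    If every label is ignored, the average is 0/0, which this sketch
    reports as NaN to mirror mean reduction with an ignore index.
    """
    kept = [
        loss
        for loss, label in zip(per_token_losses, labels)
        if label != ignore_index
    ]
    if not kept:
        return float("nan")
    return sum(kept) / len(kept)


# Hypothetical values: the last two tokens stand in for an affirmative response.
losses = [0.5, 0.7, 2.0, 4.0]
healthy_labels = [-100, -100, 1, 2]
broken_labels = [-100, -100, -100, -100]

healthy = masked_mean_loss(losses, healthy_labels)
broken = masked_mean_loss(losses, broken_labels)

print(count_supervised(healthy_labels), healthy)
print(count_supervised(broken_labels), math.isnan(broken))
```

If a count like `count_supervised` comes back as zero for a batch, the affirmative-response tokens were probably never matched in the tokenized sequence, so checking how that span is located would be a reasonable next step.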