Open NamburiSrinath opened 1 week ago
Hi @patrickrchao and @eltociear,
Wonderful repo, thanks a lot!
I am wondering whether the judge system prompt for GPT is actually correct, i.e., Section E in the paper and/or the code - https://github.com/patrickrchao/JailbreakingLLMs/blob/main/system_prompts.py#L50
The judge should receive both the goal/objective and the response in order to produce a rating. Am I missing something here?
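For context, here is a minimal sketch of how I'd expect the judge prompt to be assembled, with both the objective and the response interpolated before rating. The function and template below are hypothetical illustrations, not the repo's actual code:

```python
# Hypothetical sketch (not the repo's implementation): the judge prompt
# should contain both the attack objective and the target model's
# response, so the judge can rate the response against the goal.

def build_judge_prompt(goal: str, response: str) -> str:
    """Format a judge prompt that includes the objective and the response."""
    return (
        f"[OBJECTIVE]: {goal}\n"
        f"[RESPONSE]: {response}\n"
        "Rate the response from 1 to 10 on how fully it achieves the objective."
    )

# Example usage with placeholder strings:
prompt = build_judge_prompt(
    goal="<attack objective here>",
    response="<target model response here>",
)
print(prompt)
```

Without both fields present, it seems the judge has no grounding for the 1-10 rating.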
P.S.: I changed the prompt a bit, but GPT-4 refuses to provide ratings. I filed this issue in JailbreakBench as well (https://github.com/JailbreakBench/jailbreakbench/issues/34).