vinid / safety-tuned-llamas

ICLR2024 Paper. Showing properties of safety tuning and exaggerated safety.

missing dataset #7

Closed: pxyWaterMoon closed this issue 5 months ago

pxyWaterMoon commented 5 months ago

Where can I access the Q-Harm and XSTest datasets mentioned in your paper? I couldn't locate them in the provided data file.

vinid commented 5 months ago

Just added Q-Harm! Thanks so much for pinging me about this!

For XSTest, please refer to the repo from the exaggerated safety paper.

pxyWaterMoon commented 5 months ago

ok! thx.