LLM-Tuning-Safety / LLMs-Finetuning-Safety

We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.

https://llm-tuning-safety.github.io/

MIT License

245 stars, 29 forks

issues

A survey on a line of work following (Qi et al., 2023)

#8 huangtiansheng opened 1 month ago
0
temp not zero during inference

#7 ShengYun-Peng closed 6 months ago
2
Error because of `all_reduce` on `float` instead of `torch.Tensor`

#6 ain-soph closed 3 months ago
1
Quantized model training of llama gives error

#5 lihkinVerma closed 1 year ago
1
How the pure_bad_dataset was created??

#4 lihkinVerma closed 1 year ago
1
SafeTensors issue

#3 lihkinVerma closed 1 year ago
1
How about the response quality beyond the finetune domain

#2 wqw547243068 closed 1 year ago
1
Fix typo in train_utils.py

#1 eltociear opened 1 year ago
0