LLM-Tuning-Safety / LLMs-Finetuning-Safety
We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.
https://llm-tuning-safety.github.io/
MIT License · 245 stars · 29 forks
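
The attack described above goes through OpenAI's public fine-tuning API. As a rough illustration only (not the repo's actual script; the file name is a placeholder and no training data is shown), a fine-tuning job is submitted roughly like this:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload a small JSONL file of chat-formatted training examples, then
# start a fine-tuning job on top of gpt-3.5-turbo. "train.jsonl" is a
# placeholder; the repo's own data files are not reproduced here.
training_file = client.files.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id, job.status)
```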
Issues (newest first)
#8 · A survey on a line of work following (Qi et al., 2023) · huangtiansheng · opened 1 month ago · 0 comments
#7 · Temperature not zero during inference · ShengYun-Peng · closed 6 months ago · 2 comments
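
Issue #7 concerns inference being run with a nonzero temperature. A minimal sketch of deterministic decoding with Hugging Face `transformers` (the model name is a placeholder; the repo's actual inference code may differ): passing `do_sample=False` gives greedy decoding, the standard way to get temperature-zero behavior.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf", torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
# With do_sample=False the temperature argument is ignored and decoding
# is greedy, i.e. fully deterministic given the prompt.
output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```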
#6 · Error because of `all_reduce` on `float` instead of `torch.Tensor` · ain-soph · closed 3 months ago · 1 comment
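
Issue #6 reports a crash because `torch.distributed.all_reduce` was called on a plain Python `float`; the collective only accepts tensors. A minimal sketch of the usual fix (the function name here is hypothetical, and it assumes the default process group is already initialized): wrap the scalar in a `torch.Tensor` on the local device before reducing.

```python
import torch
import torch.distributed as dist

def reduce_mean(value: float, device: torch.device) -> float:
    # all_reduce operates in place on a tensor, so the Python float must
    # first be wrapped in a torch.Tensor on the local device.
    t = torch.tensor(value, device=device)
    dist.all_reduce(t, op=dist.ReduceOp.SUM)  # sum the scalar across ranks
    return t.item() / dist.get_world_size()   # average, back to a float
```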
#5 · Quantized model training of LLaMA gives an error · lihkinVerma · closed 1 year ago · 1 comment
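
Issue #5 hit an error when fine-tuning a quantized LLaMA model. The actual resolution isn't visible from the title, but a common setup that avoids the usual pitfalls is to prepare the quantized model for k-bit training and train only LoRA adapters, as sketched below (the model name and LoRA hyperparameters are assumptions, not the repo's configuration).

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load LLaMA in 4-bit; the quantized base weights are frozen, so training
# has to go through adapter parameters rather than the base model itself.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

# Casts layer norms to fp32 and wires up input grads for checkpointing.
model = prepare_model_for_kbit_training(model)

# Train only the (non-quantized) LoRA adapter weights.
lora = LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```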
#4 · How was the `pure_bad_dataset` created? · lihkinVerma · closed 1 year ago · 1 comment
#3 · SafeTensors issue · lihkinVerma · closed 1 year ago · 1 comment
#2 · How about the response quality beyond the fine-tuned domain? · wqw547243068 · closed 1 year ago · 1 comment
#1 · Fix typo in train_utils.py · eltociear · opened 1 year ago · 0 comments