likenneth / honest_llama

Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
MIT License
478 stars, 37 forks

Upload ITI baked-in models to HuggingFace #41

Closed: jujipotle closed this 3 months ago

jujipotle commented 3 months ago

Made changes to edit_weight.py and push_hf.py, and recorded results in iti_replication_results.md and README.md. Uploaded ITI baked-in models to HuggingFace: https://huggingface.co/collections/jujipotle/inference-time-intervention-iti-models-66ca15448347e21e8af6772e
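
For readers unfamiliar with what "baked-in" could mean here, the sketch below shows one way a fixed activation shift can be folded into a layer's bias so that no hook is needed at inference time. This is an assumption about the general idea, not a description of what edit_weight.py actually does; all function and variable names are hypothetical, and the toy matrices stand in for a real projection layer.

```python
# Hypothetical illustration: none of these names come from edit_weight.py.

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def project(W, b, x):
    """Linear projection: W @ x + b."""
    return [y + bi for y, bi in zip(matvec(W, x), b)]

def bake_shift_into_bias(W, b, shift):
    """Return b' = b + W @ shift, so that
    project(W, b', x) equals project(W, b, x + shift) for every x."""
    return [bi + d for bi, d in zip(b, matvec(W, shift))]

# Toy values (assumed, for illustration only).
W = [[1.0, 2.0], [0.0, -1.0]]
b = [0.5, 0.0]
shift = [0.25, -0.5]  # stand-in for a fixed intervention direction times its strength
x = [2.0, 4.0]

# Option 1: add the shift to the activations at inference time.
shifted_at_inference = project(W, b, [xi + si for xi, si in zip(x, shift)])

# Option 2: fold the shift into the bias once, then run the layer unmodified.
baked_bias = bake_shift_into_bias(W, b, shift)
shifted_via_bias = project(W, baked_bias, x)
```

Both options give the same output because the projection is linear: W(x + shift) + b = Wx + (b + W shift). A model edited this way can be saved and uploaded like any other checkpoint, which is presumably why the uploaded models need no extra intervention code.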