andyzoujm / representation-engineering

Representation Engineering: A Top-Down Approach to AI Transparency

https://www.ai-transparency.org/

MIT License

713 stars, 86 forks

issues

Missing _use_flash_attention_2 in Llama Model with ContrastVecLlamaForCausalLM

#56 slfg opened 1 week ago
0
kwargs missed during generation

#55 HenryCai11 opened 2 months ago
0
How do you evaluate the in-domain generalization of the honesty probe?

#54 wenjunli-0 opened 2 months ago
1
gemma-2 support / sdpa NaN error

#53 justinwangx opened 2 months ago
0
v0.1.4 release

#52 justinwangx closed 2 months ago
0
feat: bfloat16 support for rep reading

#51 justinwangx closed 2 months ago
0
Which control_method options are available? Only "reading_vec"?

#50 wenjunli-0 opened 2 months ago
0
LOSS does not need original, LOSS calibrate the difference between +/- ??

#49 YerongLi closed 2 months ago
1
v0.1.3 release

#48 justinwangx closed 3 months ago
0
Padding and only align CoT side

#47 YerongLi closed 3 months ago
0
What are the parameters to reproduce the results on TQA?

#46 YerongLi closed 2 months ago
1
Train large model on multiple GPUs. You can't train a model that has been loaded with `device_map='auto'` in any distributed mode.

#45 YerongLi closed 3 months ago
2
neg/pos prompt length never changes: is this expected?

#44 YerongLi closed 3 months ago
2
Do we need to optimize over the contrastive vector loss?

#43 YerongLi closed 3 months ago
2
update pyproject.toml

#42 justinwangx closed 3 months ago
0
add pyproject.toml

#41 justinwangx closed 3 months ago
0
Questions when using honesty.ipynb on llama-7b-chat-hf

#40 yuxuanfanOrion closed 3 months ago
2
How to make RepE support image input when using llama3 as the base model?

#39 poa010101 opened 5 months ago
0
Need help with honesty benchmark against vanilla llama2

#38 poa010101 opened 5 months ago
0
How to use RAG in Repe?

#37 poa010101 closed 5 months ago
3
Add projection operation

#36 adamkarvonen opened 7 months ago
1
Understanding the contrast vector implementation

#35 Luca-vdB opened 8 months ago
1
CLIP Examples for Emotion Classification

#34 danielz02 opened 8 months ago
0
[WIP] Self concept exemplar

#33 AetherPrior closed 8 months ago
1
Performance of Harmfulness exp is too high?

#32 shaoyangxu opened 9 months ago
1
Condense similar code involving self_attn, mlp, input_layernorm, post_attention_layernorm

#31 mxl1n closed 10 months ago
0
Performance Enhancement

#30 poa010101 opened 10 months ago
1
Performance on Open LLM Leaderboard

#29 chenweixin107 closed 10 months ago
1
Contrast Vector - Add Code for Generation?

#28 RobertMcCarthy97 opened 11 months ago
11
Question about the honesty scores calculation

#27 Jeffwang87 opened 11 months ago
1
How to select the layers used for control?

#26 Angelo3357 opened 11 months ago
2
Unexpected behavior from honesty example notebook

#25 zxchen98 closed 11 months ago
3
How to automate the [threshold] parameter in Honest example?

#24 poa010101 closed 9 months ago
2
Question about max and min function in LAT reading

#23 Jeffwang87 closed 11 months ago
5
Enhancing RepControl by introducing the pca_model's `explained_variance_ratio_`

#22 semicircle opened 11 months ago
2
Question about emotion_funtion

#21 sorcererrandy closed 11 months ago
2
Question about customizing the pipeline in code

#20 Jeffwang87 opened 11 months ago
1
Dataset in example honest notebook

#19 Jeffwang87 closed 11 months ago
3
About Harmlessness concept and controlling

#18 Alan-Qin closed 11 months ago
7
n_difference parameter with clustermean?

#17 vthost closed 11 months ago
1
Dataset in example honest notebook

#16 Zijian007 opened 12 months ago
13
Accelerate rep-reading using GPU

#15 Y-L-LIU closed 11 months ago
4
Accelerate the rep-reading

#14 Y-L-LIU opened 1 year ago
7
I have some questions and ask the author for help

#13 Chenjingxue closed 12 months ago
1
Assert error "NaN in output logprobs"

#12 chenlidar closed 1 year ago
3
Wizard-Vicuna-30B-Uncensored

#11 caesar-jojo closed 1 year ago
2
Question about layer_id

#10 sorcererrandy closed 1 year ago
7
Multi-turn conversation

#9 sorcererrandy closed 1 year ago
0
How to prompt Llama2-13b-chat to generate false answers?

#8 chenweixin107 closed 1 year ago
1
Random comments ignore

#7 wassname closed 1 year ago
1