jettjaniak
/
teren
Linking activation space features to model behavior
Apache License 2.0
0
stars
1
fork
issues
create core experiment script
#29
jettjaniak
opened
4 months ago
0
core_experiment script, WIP
#28
jettjaniak
closed
4 months ago
0
Adding Lindsey note perturbations
#27
ChatrikMangat
closed
4 months ago
0
Jupyter formatting in pre-commit
#26
jettjaniak
closed
4 months ago
0
add OAI's SAEs to Lindsey's repro nb
#25
jettjaniak
closed
4 months ago
0
extend Lindsey's repro to perturb at multiple scales & compute KL
#24
jettjaniak
closed
3 months ago
1
save SAEFeatureExamples++ to persistent storage
#23
jettjaniak
opened
4 months ago
1
make Lindsey's repro work with OAI SAEs
#22
jettjaniak
closed
4 months ago
0
pre-commit should check notebooks with black
#21
jettjaniak
closed
4 months ago
0
Add the ability to specify, for each perturbation, whether we perturb from the activation or towards another point
#20
GiglemaAI
opened
4 months ago
1
Create a separate clean experiment which replicates Stepan's results e2e for at least naive random
#19
GiglemaAI
opened
4 months ago
0
Implement necessary perturbations
#18
GiglemaAI
closed
4 months ago
1
migrate new perturbations
#17
jettjaniak
closed
4 months ago
2
typing, separate fns/methods
#16
jettjaniak
closed
4 months ago
0
how to turn inactive features on?
#15
jettjaniak
opened
4 months ago
0
think about how to abstract at which sequence position we perturb or calculate loss
#14
jettjaniak
opened
4 months ago
0
split the `get_pert_loss` function so that we can access perturbed activations if we want to
#13
jettjaniak
closed
4 months ago
0
add method to perturbations that returns pert_by_feature_id
#12
jettjaniak
closed
4 months ago
0
make it harder to confuse feature_idx and feature_id
#11
jettjaniak
closed
4 months ago
0
Compute resid mean across the whole dataset
#10
ChatrikMangat
closed
4 months ago
0
figure out how to compute KL divergence effectively
#9
jettjaniak
closed
4 months ago
1
figure out how to compute "standard error of the mean"
#8
jettjaniak
closed
4 months ago
0
initial base experiment
#7
GiglemaAI
closed
4 months ago
0
perturb. revamp & notebook
#6
jettjaniak
closed
4 months ago
0
update NVIDIA drivers on VMs
#5
jettjaniak
closed
4 months ago
0
auto shutdown VMs
#4
jettjaniak
closed
4 months ago
1
figure out how to deal with private keys on VMs
#3
jettjaniak
closed
4 months ago
0
Zotero
#2
jettjaniak
closed
4 months ago
0
make HF org
#1
jettjaniak
closed
4 months ago
0