qfeuilla/TuringMirror
A benchmark to test whether AI models can distinguish their own output from that of humans or other AI models
2 stars, 0 forks
issues
#11 Get feedback and input from mech-interp gurus (jas-ho, opened 1 year ago, 0 comments)
#10 Sanity check generation of gpt-3.5 fables (jas-ho, opened 1 year ago, 0 comments)
#9 Human performance baseline (jas-ho, opened 1 year ago, 0 comments)
#8 Data analysis of generated datasets (jas-ho, opened 1 year ago, 0 comments)
#7 Sanity checks (jas-ho, opened 1 year ago, 0 comments)
#6 Test correlation of attribution with expressed preference (jas-ho, opened 1 year ago, 1 comment)
#5 Few-shot baseline for the binary-comparison prompt (jas-ho, opened 1 year ago, 0 comments)
#4 Try Python function prompt (jas-ho, opened 1 year ago, 0 comments)
#3 Evaluate tests involving base model generations (jas-ho, opened 1 year ago, 0 comments)
#2 Scout a complementary dataset from different linguistic domain (jas-ho, opened 1 year ago, 0 comments)
#1 Extend fables dataset to generations from base models (jas-ho, opened 1 year ago, 0 comments)