qfeuilla TuringMirror issues - Githubissues

qfeuilla / TuringMirror

A benchmark to test whether AIs are able to recognize their own output among text written by humans or other AIs

2 stars 0 forks

issues


Get feedback and input from mech-interp gurus
#11 jas-ho opened 1 year ago (0 comments)

Sanity check generation of gpt-3.5 fables
#10 jas-ho opened 1 year ago (0 comments)

Human performance baseline
#9 jas-ho opened 1 year ago (0 comments)

Data analysis of generated datasets
#8 jas-ho opened 1 year ago (0 comments)

Sanity checks
#7 jas-ho opened 1 year ago (0 comments)

Test correlation of attribution with expressed preference
#6 jas-ho opened 1 year ago (1 comment)

Few-shot baseline for the binary-comparison prompt
#5 jas-ho opened 1 year ago (0 comments)

Try Python function prompt
#4 jas-ho opened 1 year ago (0 comments)

Evaluate tests involving base model generations
#3 jas-ho opened 1 year ago (0 comments)

Scout a complementary dataset from different linguistic domain
#2 jas-ho opened 1 year ago (0 comments)

Extend fables dataset to generations from base models
#1 jas-ho opened 1 year ago (0 comments)