EleutherAI / project-menu

See the issue board for the current status of active and prospective projects!

[Idea] Are big LMs mesaoptimizing? #23

Closed leogao2 closed 1 year ago

leogao2 commented 3 years ago

Motivation

Mesaoptimization in big LMs would be kind of concerning, and would make any LM+RL really unsafe.

Main problems to figure out are:

I don't have satisfying answers for these yet.

Hypothesis/Conjecture

Big LMs might be mesaoptimizing -- seems plausible given how LMs can model agenty things.

Proposed Experiments (or series of experiments)


Let me know what you think about the hypothesis and the design of experiments in the comments below! Also, feel free to propose new or better experiments.