Open some-rando-rl opened 2 years ago
What should our rewards be?
How do we want to evaluate them?
Do we want to normalize rewards? What about advantage?
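To make the normalization question concrete, here is a minimal sketch of what "normalizing" could mean in practice. It is one possible option, not a decision: the helper names `discounted_returns` and `normalize`, the `gamma` value, and the use of the mean return as a baseline are all assumptions for illustration.

```python
from statistics import fmean, pstdev


def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return G_t for every step of one episode."""
    returns = []
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def normalize(values, eps=1e-8):
    """Shift values to zero mean and scale them to (roughly) unit std."""
    mean = fmean(values)
    std = pstdev(values)
    # eps avoids division by zero when all values are identical.
    return [(v - mean) / (std + eps) for v in values]


# Hypothetical episode rewards, only for illustration.
rewards = [1.0, 0.0, 2.0, 1.0]
returns = discounted_returns(rewards, gamma=0.5)

# Assumption: the batch mean return acts as the baseline, so the
# normalized returns double as normalized advantages.
advantages = normalize(returns)
```

The same `normalize` helper could be applied to raw rewards instead of advantages; which of the two (if either) we want is exactly the open question.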