vlad17/mve
MVE: model-based value estimation
Apache License 2.0
10 stars, 0 forks
issues
Add acrobot
#308
vlad17
closed
6 years ago
0
get rid of multi_step
#307
vlad17
opened
6 years ago
0
DDPG Timestep-Dependent Stopping condition
#306
alvinwan
closed
6 years ago
0
Swimmer env
#305
alvinwan
closed
6 years ago
3
Shorter period for parameter updates
#304
alvinwan
closed
6 years ago
0
symmetric finite diffs step size for actor-gradient
#303
vlad17
closed
6 years ago
1
k-step temporal differences
#302
vlad17
closed
6 years ago
0
added numpy reward wrapper for envs
#301
vlad17
closed
6 years ago
0
Make gym 0.9.6 compat
#300
vlad17
opened
6 years ago
1
single gym version
#299
vlad17
closed
6 years ago
0
remove closed-loop dynamics monitoring and remove assoc dead code
#298
vlad17
closed
6 years ago
0
use learned dynamics in model value expansion
#297
vlad17
closed
6 years ago
0
Gym hotfix
#296
vlad17
closed
6 years ago
0
Added reward scaling
#295
alvinwan
closed
6 years ago
11
Debug non-determinism
#294
vlad17
opened
6 years ago
2
one-step actor gradient model expansion
#293
vlad17
closed
6 years ago
0
Make envs
#292
vlad17
closed
6 years ago
4
add gym2 functionality
#291
vlad17
closed
6 years ago
0
No human rendering mode (for tests)
#290
vlad17
closed
6 years ago
0
always seed deterministically
#289
vlad17
closed
6 years ago
0
removed distributed TF cruft
#288
vlad17
closed
6 years ago
0
Move `discounted_rewards` + Additional Test
#287
alvinwan
closed
6 years ago
1
Add poster
#286
vlad17
closed
6 years ago
0
Vectorize MJC
#285
vlad17
closed
6 years ago
1
added multithreaded env
#284
vlad17
closed
6 years ago
0
make flag that chooses which parallel environment to use in DDPG
#283
vlad17
closed
6 years ago
0
randomize ports
#282
vlad17
closed
6 years ago
0
faster between-graph replication for distributed TF
#281
vlad17
closed
6 years ago
1
async "processes" for data gathering, training, evaluation -- allow discrete # of steps for each
#280
vlad17
closed
6 years ago
0
between-graph replication for DDPG
#279
vlad17
closed
6 years ago
0
migrate experiment.py flags
#278
vlad17
closed
6 years ago
1
centralize DDPG class
#277
vlad17
closed
6 years ago
0
terminal Q bias fix in qvalues().
#276
vlad17
closed
6 years ago
1
on-policy oracle training
#275
vlad17
closed
6 years ago
0
move to distributed TF + pushdown actor
#274
vlad17
closed
6 years ago
0
remove mask and horizon from sample()
#273
vlad17
closed
6 years ago
0
auto-inline parallel venvs with 1 env.
#272
vlad17
closed
6 years ago
1
migrate multiprocessing venv into a cleaner version
#271
vlad17
closed
6 years ago
0
updated mjkey
#270
vlad17
closed
6 years ago
0
verbose by default, --quiet flag to opt-in to no messages
#269
vlad17
opened
6 years ago
0
faster multiprocessing env
#268
vlad17
closed
6 years ago
1
[WIP/dontmerge] additional training steps for the actor
#267
vlad17
closed
6 years ago
0
actually solve ddpg.py concurrency issue
#266
vlad17
closed
6 years ago
0
concurrent-ish ddpg
#265
vlad17
closed
6 years ago
0
separate actor/critic decays
#264
vlad17
closed
6 years ago
0
Added Minimum Buffer Size + Random Shooter with True Dynamics
#263
alvinwan
closed
6 years ago
5
add a generic progress bar
#262
vlad17
closed
6 years ago
1
get rid of ddpg flags arguments
#261
vlad17
closed
6 years ago
1
model expansion for DDPG
#260
vlad17
closed
6 years ago
0
print a pure model-based Q estimate as well as the original Q
#259
vlad17
closed
6 years ago
0