vlad17 mve issues - Githubissues

vlad17 / mve

MVE: model-based value estimation

Apache License 2.0

10 stars 0 forks

issues


Add acrobot

#308 vlad17 closed 6 years ago
0
get rid of multi_step

#307 vlad17 opened 6 years ago
0
DDPG Timestep-Dependent Stopping condition

#306 alvinwan closed 6 years ago
0
Swimmer env

#305 alvinwan closed 6 years ago
3
Shorter period for parameter updates

#304 alvinwan closed 6 years ago
0
symmetric finite diffs step size for actor-gradient

#303 vlad17 closed 6 years ago
1
k-step temporal differences

#302 vlad17 closed 6 years ago
0
added numpy reward wrapper for envs

#301 vlad17 closed 6 years ago
0
Make gym 0.9.6 compat

#300 vlad17 opened 6 years ago
1
single gym version

#299 vlad17 closed 6 years ago
0
remove closed-loop dynamics monitoring and remove assoc dead code

#298 vlad17 closed 6 years ago
0
use learned dynamics in model value expansion

#297 vlad17 closed 6 years ago
0
Gym hotfix

#296 vlad17 closed 6 years ago
0
Added reward scaling

#295 alvinwan closed 6 years ago
11
Debug non-determinism

#294 vlad17 opened 6 years ago
2
one-step actor gradient model expansion

#293 vlad17 closed 6 years ago
0
Make envs

#292 vlad17 closed 6 years ago
4
add gym2 functionality

#291 vlad17 closed 6 years ago
0
No human rendering mode (for tests)

#290 vlad17 closed 6 years ago
0
always seed deterministically

#289 vlad17 closed 6 years ago
0
removed distributed TF cruft

#288 vlad17 closed 6 years ago
0
Move `discounted_rewards` + Additional Test

#287 alvinwan closed 6 years ago
1
Add poster

#286 vlad17 closed 6 years ago
0
Vectorize MJC

#285 vlad17 closed 6 years ago
1
added multithreaded env

#284 vlad17 closed 6 years ago
0
make flag that chooses which parallel environment to use in DDPG

#283 vlad17 closed 6 years ago
0
randomize ports

#282 vlad17 closed 6 years ago
0
faster between-graph replication for distributed TF

#281 vlad17 closed 6 years ago
1
async "processes" for data gathering, training, evaluation -- allow discrete # of steps for each

#280 vlad17 closed 6 years ago
0
between-graph replication for DDPG

#279 vlad17 closed 6 years ago
0
migrate experiment.py flags

#278 vlad17 closed 6 years ago
1
centralize DDPG class

#277 vlad17 closed 6 years ago
0
terminal Q bias fix in qvalues().

#276 vlad17 closed 6 years ago
1
on-policy oracle training

#275 vlad17 closed 6 years ago
0
move to distributed TF + pushdown actor

#274 vlad17 closed 6 years ago
0
remove mask and horizon from sample()

#273 vlad17 closed 6 years ago
0
auto-inline parallel venvs with 1 env.

#272 vlad17 closed 6 years ago
1
migrate multiprocessing venv into a cleaner version

#271 vlad17 closed 6 years ago
0
updated mjkey

#270 vlad17 closed 6 years ago
0
verbose by default, --quiet flag to opt-in to no messages

#269 vlad17 opened 6 years ago
0
faster multiprocessing env

#268 vlad17 closed 6 years ago
1
[WIP/dontmerge] additional training steps for the actor

#267 vlad17 closed 6 years ago
0
actually solve ddpg.py concurrency issue

#266 vlad17 closed 6 years ago
0
concurrent-ish ddpg

#265 vlad17 closed 6 years ago
0
separate actor/critic decays

#264 vlad17 closed 6 years ago
0
Added Minimum Buffer Size + Random Shooter with True Dynamics

#263 alvinwan closed 6 years ago
5
add a generic progress bar

#262 vlad17 closed 6 years ago
1
get rid of ddpg flags arguments

#261 vlad17 closed 6 years ago
1
model expansion for DDPG

#260 vlad17 closed 6 years ago
0
print a pure model-based Q estimate as well as the original Q

#259 vlad17 closed 6 years ago
0
