thiagopbueno / model-aware-policy-optimization

MAPO: Model-Aware Policy Optimization algorithm

GNU General Public License v3.0

1 star, 0 forks

issues

Feature/navigation 1d

#101 thiagopbueno closed 5 years ago
0
feat: use target models in mapdg gradient for dynamics loss

#100 0xangelo closed 5 years ago
0
Add a DynamicsSubnets model class

#99 thiagopbueno opened 5 years ago
0
Add a Gaussian dynamics model base class

#98 thiagopbueno opened 5 years ago
0
feat: add histograms and clear terminal output

#97 0xangelo closed 5 years ago
0
Feature/params histogram

#96 0xangelo closed 5 years ago
0
ci: parallelize tests

#95 0xangelo closed 5 years ago
0
Feature/grad stats

#94 thiagopbueno closed 5 years ago
1
Feature/reparametrize dynamics

#93 0xangelo closed 5 years ago
0
Add fetches for MLE loss computation for PG-aware gradients

#92 thiagopbueno opened 5 years ago
0
feat: add critic explained variance statistic

#91 0xangelo closed 5 years ago
0
fix: check obs type when branching factor is 0

#90 0xangelo closed 5 years ago
0
Bugfix/fix time aware

#89 0xangelo closed 5 years ago
0
Bugfix/fix time aware

#88 0xangelo closed 5 years ago
0
Feature/time awareness

#87 0xangelo closed 5 years ago
0
Feature/time awareness

#86 0xangelo closed 5 years ago
0
Feature: add defaults for debugging with episode traces

#85 0xangelo closed 5 years ago
1
Add dynamics_delay option to scripts/mapo

#84 thiagopbueno closed 5 years ago
0
Feature: add dynamics delay and reorder apply ops

#83 0xangelo closed 5 years ago
0
fix: maximize the actor objective

#82 0xangelo closed 5 years ago
0
Fix the sign of model_aware_policy_loss in order to minimize it

#81 thiagopbueno closed 5 years ago
1
Apply dynamics delay in mapo_policy.apply_gradients_with_delays

#80 thiagopbueno closed 5 years ago
1
Add bounded output layer in critic network (parametrized via config flag) for problems with non-positive reward

#79 thiagopbueno opened 5 years ago
1
Plot histograms of trainable variables of actor and critic networks (weights, biases, ...) through training

#78 thiagopbueno opened 5 years ago
0
Plot histograms for model trainable variables (weights, bias, ...)

#77 thiagopbueno closed 5 years ago
0
test: skip exploration tests

#76 0xangelo closed 5 years ago
0
feat: use observed samples when branching factor is 0

#75 0xangelo closed 5 years ago
0
Use true samples in MAPO gradient when branching factor == 0

#74 thiagopbueno closed 5 years ago
0
Parametrize optimizer for model learning

#73 thiagopbueno closed 5 years ago
0
Parametrize optimizers for actor and critic learning

#72 thiagopbueno closed 5 years ago
0
Add optimizer options in Trainer config via scripts/mapo

#71 thiagopbueno closed 5 years ago
0
Add text-based generic logger in MAPOTFCustomEnv class

#70 thiagopbueno closed 5 years ago
0
Bugfix: ignore dynamics network when using the environment's dynamics

#69 0xangelo closed 5 years ago
0
Feature/networks initialization

#68 0xangelo closed 5 years ago
2
Bug fix: unknown `dynamics` key when running on-policy mapo with MLE

#67 thiagopbueno closed 5 years ago
0
Bug fix: ignore gradients for dynamics model in mapo_policy.compute_separate_gradients

#66 thiagopbueno closed 5 years ago
0
Feature/experiments

#65 thiagopbueno closed 5 years ago
0
Properly handle timeout terminations

#64 0xangelo closed 5 years ago
0
Feature/kernels

#63 thiagopbueno closed 5 years ago
1
feat(env): build transition log prob ops in MAPOTFCustomEnv

#62 thiagopbueno closed 5 years ago
2
Compute state transition log likelihood in MAPOTFCustomEnv and subclasses

#61 thiagopbueno closed 5 years ago
0
Feature: add pga losses and losses submodule

#60 0xangelo closed 5 years ago
0
Add kernel option to Trainer config via scripts/MAPO

#59 thiagopbueno closed 5 years ago
1
Add use_true_dynamics option to Trainer config via scripts/MAPO

#58 thiagopbueno closed 5 years ago
0
Add log_level and monitor options in Trainer config via scripts/mapo

#57 thiagopbueno opened 5 years ago
0
Add kernels for computing gradient-aware model learning loss

#56 thiagopbueno closed 5 years ago
0
Feature/custom gym base class

#55 thiagopbueno closed 5 years ago
0
Allow policy access to the environment

#54 0xangelo closed 5 years ago
1
Improvements for feature: on policy ma-dpg

#53 0xangelo closed 5 years ago
0
Add variables initializer to GaussianDynamicsModel

#52 thiagopbueno closed 5 years ago
0