thiagopbueno
/
model-aware-policy-optimization
MAPO: Model-Aware Policy Optimization algorithm
GNU General Public License v3.0
1
stars
0
forks
issues
Feature/navigation 1d
#101
thiagopbueno
closed
5 years ago
0
feat: use target models in mapdg gradient for dynamics loss
#100
0xangelo
closed
5 years ago
0
Add a DynamicsSubnets model class
#99
thiagopbueno
opened
5 years ago
0
Add a Gaussian dynamics model base class
#98
thiagopbueno
opened
5 years ago
0
feat: add histograms and clear terminal output
#97
0xangelo
closed
5 years ago
0
Feature/params histogram
#96
0xangelo
closed
5 years ago
0
ci: parallelize tests
#95
0xangelo
closed
5 years ago
0
Feature/grad stats
#94
thiagopbueno
closed
5 years ago
1
Feature/reparametrize dynamics
#93
0xangelo
closed
5 years ago
0
Add fetches for MLE loss computation for PG-aware gradients
#92
thiagopbueno
opened
5 years ago
0
feat: add critic explained variance statistic
#91
0xangelo
closed
5 years ago
0
fix: check obs type when branching factor is 0
#90
0xangelo
closed
5 years ago
0
Bugfix/fix time aware
#89
0xangelo
closed
5 years ago
0
Bugfix/fix time aware
#88
0xangelo
closed
5 years ago
0
Feature/time awareness
#87
0xangelo
closed
5 years ago
0
Feature/time awareness
#86
0xangelo
closed
5 years ago
0
Feature: add defaults for debugging with episode traces
#85
0xangelo
closed
5 years ago
1
Add dynamics_delay option to scripts/mapo
#84
thiagopbueno
closed
5 years ago
0
Feature: add dynamics delay and reorder apply ops
#83
0xangelo
closed
5 years ago
0
fix: maximize the actor objective
#82
0xangelo
closed
5 years ago
0
Fix the sign of model_aware_policy_loss in order to minimize it
#81
thiagopbueno
closed
5 years ago
1
Apply dynamics delay in mapo_policy.apply_gradients_with_delays
#80
thiagopbueno
closed
5 years ago
1
Add bounded output layer in critic network (parametrized via config flag) for problems with non-positive reward
#79
thiagopbueno
opened
5 years ago
1
Plot histograms of trainable variables of actor and critic networks (weights, biases, ...) through training
#78
thiagopbueno
opened
5 years ago
0
Plot histograms for model trainable variables (weights, bias, ...)
#77
thiagopbueno
closed
5 years ago
0
test: skip exploration tests
#76
0xangelo
closed
5 years ago
0
feat: use observed samples when branching factor is 0
#75
0xangelo
closed
5 years ago
0
Use true samples in MAPO gradient when branching factor == 0
#74
thiagopbueno
closed
5 years ago
0
Parametrize optimizer for model learning
#73
thiagopbueno
closed
5 years ago
0
Parametrize optimizers for actor and critic learning
#72
thiagopbueno
closed
5 years ago
0
Add optimizer options in Trainer config via scripts/mapo
#71
thiagopbueno
closed
5 years ago
0
Add text-based generic logger in MAPOTFCustomEnv class
#70
thiagopbueno
closed
5 years ago
0
Bugfix: ignore dynamics network when using the environment's dynamics
#69
0xangelo
closed
5 years ago
0
Feature/networks initialization
#68
0xangelo
closed
5 years ago
2
Bug fix: unknown `dynamics` key when running on-policy mapo with MLE
#67
thiagopbueno
closed
5 years ago
0
Bug fix: ignore gradients for dynamics model in mapo_policy.compute_separate_gradients
#66
thiagopbueno
closed
5 years ago
0
Feature/experiments
#65
thiagopbueno
closed
5 years ago
0
Properly handle timeout terminations
#64
0xangelo
closed
5 years ago
0
Feature/kernels
#63
thiagopbueno
closed
5 years ago
1
feat(env): build transition log prob ops in MAPOTFCustomEnv
#62
thiagopbueno
closed
5 years ago
2
Compute state transition log likelihood in MAPOTFCustomEnv and subclasses
#61
thiagopbueno
closed
5 years ago
0
Feature: add pga losses and losses submodule
#60
0xangelo
closed
5 years ago
0
Add kernel option to Trainer config via scripts/MAPO
#59
thiagopbueno
closed
5 years ago
1
Add use_true_dynamics option to Trainer config via scripts/MAPO
#58
thiagopbueno
closed
5 years ago
0
Add log_level and monitor options in Trainer config via scripts/mapo
#57
thiagopbueno
opened
5 years ago
0
Add kernels for computing gradient-aware model learning loss
#56
thiagopbueno
closed
5 years ago
0
Feature/custom gym base class
#55
thiagopbueno
closed
5 years ago
0
Allow policy access to the environment
#54
0xangelo
closed
5 years ago
1
Improvements for feature: on policy ma-dpg
#53
0xangelo
closed
5 years ago
0
Add variables initializer to GaussianDynamicsModel
#52
thiagopbueno
closed
5 years ago
0