-
scripts/config/main/webrl.yaml:
defaults:
  - default
  - _self_
save_path: /workspace/WebRL/scripts/output
run_name: "webrl"
critic_lm: …
# training
policy_lm: /workspace/WebRL/webrl-glm-4-9…
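The `defaults` list with `_self_` is Hydra-style composition: the shared `default` config is loaded first, then this file's own keys override it. A minimal sketch of how such a config is typically consumed, assuming Hydra is the loader and that the `config_path`/`config_name` below match the file location shown above (not verified against the WebRL code):

```python
# Hypothetical entry point that composes scripts/config/main/webrl.yaml via Hydra.
import hydra
from omegaconf import DictConfig, OmegaConf

@hydra.main(config_path="scripts/config/main", config_name="webrl", version_base=None)
def main(cfg: DictConfig) -> None:
    # `default` is merged first, then this file's keys (_self_) override it
    print(OmegaConf.to_yaml(cfg))
    print(cfg.policy_lm, cfg.critic_lm)  # the two model paths set above

if __name__ == "__main__":
    main()
```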
-
If I understand the current PPO code correctly, this instantiates completely separate actor and critic models, with no layers shared between them. (Please correct me if that's wrong.)
Instea…
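For reference, a minimal PyTorch sketch of the two layouts being contrasted; the class names and sizes are illustrative, not taken from the PPO code in question:

```python
import torch
import torch.nn as nn

class SeparateActorCritic(nn.Module):
    """Actor and critic are independent networks with no shared parameters."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.actor = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh(), nn.Linear(hidden, act_dim))
        self.critic = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))

    def forward(self, obs):
        return self.actor(obs), self.critic(obs)

class SharedActorCritic(nn.Module):
    """A common trunk feeds separate policy and value heads, so early layers are shared."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh())
        self.policy_head = nn.Linear(hidden, act_dim)
        self.value_head = nn.Linear(hidden, 1)

    def forward(self, obs):
        h = self.trunk(obs)
        return self.policy_head(h), self.value_head(h)
```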
-
## Fix the model test for `soft_actor_critic.py`
1. setup env according to [Run a model under torch_xla2](https://github.com/pytorch/xla/blob/master/experimental/torch_xla2/docs/support_a_new_model…
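A rough sketch of what the setup plus a smoke test might look like, assuming the `torch_xla2.default_env()` context-manager workflow described in the linked doc (that API name is taken on trust from the doc, not verified here), with a stand-in model in place of the real one from `soft_actor_critic.py`:

```python
import torch
import torch.nn as nn
import torch_xla2  # assumes torch_xla2 is installed per the linked setup doc

class TinySAC(nn.Module):
    """Stand-in for the real soft_actor_critic model, used only for the smoke test."""
    def __init__(self, obs_dim=4, act_dim=2):
        super().__init__()
        self.actor = nn.Linear(obs_dim, act_dim)
        self.critic = nn.Linear(obs_dim, 1)

    def forward(self, obs):
        return self.actor(obs), self.critic(obs)

env = torch_xla2.default_env()  # assumption: API name as shown in the torch_xla2 docs
with env:
    model = TinySAC()
    actions, values = model(torch.randn(1, 4))  # forward pass executed under torch_xla2
```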
-
I'm working on a custom PPO agent where the actor learns both the mean and variance of the action distribution. To implement this, I've overridden the `get_action` method and modified the actor's `for…
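A minimal PyTorch sketch of one common way to do this, where the actor's forward pass returns both the mean and a (log) standard deviation and `get_action` samples from the resulting Normal; the method names mirror the description above, but the code is otherwise hypothetical, not the asker's actual implementation:

```python
import torch
import torch.nn as nn
from torch.distributions import Normal

class GaussianActor(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh())
        self.mean_head = nn.Linear(hidden, act_dim)
        self.log_std_head = nn.Linear(hidden, act_dim)  # predict log std for numerical stability

    def forward(self, obs):
        h = self.body(obs)
        mean = self.mean_head(h)
        log_std = self.log_std_head(h).clamp(-20, 2)  # keep std in a sane range
        return mean, log_std

    def get_action(self, obs):
        mean, log_std = self.forward(obs)
        dist = Normal(mean, log_std.exp())
        action = dist.sample()
        return action, dist.log_prob(action).sum(-1)
```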
-
### Issue type
Bug
### Have you reproduced the bug with TensorFlow Nightly?
Yes
### Source
source
### TensorFlow version
V1
### Custom code
Yes
### OS platform and distribution
_No response…
-
Reduce duplication of similar Actors/Critics with only the hidden layers being different - generally improve the readability of the code for creating the networks.
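One way this is often done is a single network-builder that both actors and critics call, with only the hidden-layer spec varying; a minimal sketch (names and sizes illustrative, not taken from the existing code):

```python
import torch.nn as nn

def build_mlp(in_dim, out_dim, hidden_sizes, activation=nn.ReLU):
    """Build an MLP whose hidden layers are fully described by `hidden_sizes`."""
    layers, prev = [], in_dim
    for h in hidden_sizes:
        layers += [nn.Linear(prev, h), activation()]
        prev = h
    layers.append(nn.Linear(prev, out_dim))
    return nn.Sequential(*layers)

# Actor and critic now differ only in their output size and hidden-layer spec.
actor = build_mlp(in_dim=8, out_dim=2, hidden_sizes=(256, 256))
critic = build_mlp(in_dim=8, out_dim=1, hidden_sizes=(400, 300))
```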
-
Hi,
thanks for the amazing work on RL environments in JAX. I was wondering if you have any plans to write Actor-Critic agents for this work?
-
Hello everyone,
I've encountered a problem while implementing an A2C (Advantage Actor-Critic) network with Flax and Optax. My network includes _policy_network_ and _value_network_, each containi…
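For context, a minimal Flax/Optax sketch of the usual two-network A2C setup (separate modules, separate optimizer states); the layer sizes and learning rates are illustrative, not taken from the asker's code:

```python
import jax
import jax.numpy as jnp
import flax.linen as nn
import optax

class PolicyNetwork(nn.Module):
    n_actions: int

    @nn.compact
    def __call__(self, x):
        x = nn.relu(nn.Dense(64)(x))
        return nn.Dense(self.n_actions)(x)  # action logits

class ValueNetwork(nn.Module):
    @nn.compact
    def __call__(self, x):
        x = nn.relu(nn.Dense(64)(x))
        return nn.Dense(1)(x)  # state value

key = jax.random.PRNGKey(0)
dummy_obs = jnp.zeros((1, 4))

policy = PolicyNetwork(n_actions=2)
value = ValueNetwork()
policy_params = policy.init(key, dummy_obs)
value_params = value.init(key, dummy_obs)

# One optimizer (and optimizer state) per network, so the two parameter sets
# are updated independently.
policy_tx = optax.adam(3e-4)
value_tx = optax.adam(1e-3)
policy_opt_state = policy_tx.init(policy_params)
value_opt_state = value_tx.init(value_params)
```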
-
Fairseq contains many NMT models, but models trained with reinforcement learning are absent.
It would be great if those were added.
-
'exp_name': 'puckworld1D_cont_CDiscDDPG', 'env_name': 'puckworld_continuous_1d', 'env_type': 'csuite', 'agent_name': 'CD_DDPG', 'nonlinear': True, 'exp_type': 'control', 'render': False, 'num_runs': 5…