-
Hi.
Is there any way to evaluate a model trained with the RNaD algorithm against a random agent (as in `tic_tac_toe_dqn_vs_tabular.py`, for example)?
In tic_tac_toe_dqn_vs_tabular.py, the action is take…
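A generic evaluation loop of this kind can be sketched as below. This is only a minimal sketch: the `new_game` state interface (`is_terminal`, `current_player`, `legal_actions`, `apply_action`, `returns`) mirrors OpenSpiel's state API, but `evaluate_vs_random` and `trained_policy` are hypothetical names, not library functions.

```python
import random

def evaluate_vs_random(new_game, trained_policy, episodes=100, seed=0):
    """Play `episodes` games of the trained agent (player 0) against a
    uniform-random opponent (player 1); return player 0's average return.

    `new_game()` must return a state exposing an OpenSpiel-like interface:
    .is_terminal(), .current_player(), .legal_actions(),
    .apply_action(a), .returns().
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(episodes):
        state = new_game()
        while not state.is_terminal():
            if state.current_player() == 0:
                action = trained_policy(state)          # trained agent
            else:
                action = rng.choice(state.legal_actions())  # random opponent
            state.apply_action(action)
        total += state.returns()[0]
    return total / episodes
```

The same loop works for any turn-based two-player game as long as the state object exposes those five methods.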
-
The default Adam optimizer has a `fused` flag, which, according to the docs, is significantly faster than the default when used on CUDA. Using it with PPO generates an exception, which complains that …
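For reference, enabling the flag looks roughly like the sketch below. The `fused` parameter of `torch.optim.Adam` is real; the fallback-to-default when no GPU is present is my own assumption about a safe way to use it, not something the issue prescribes.

```python
import torch

model = torch.nn.Linear(4, 2)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# fused=True fuses the Adam update into a single kernel on CUDA;
# fall back to the default implementation when no GPU is available
opt = torch.optim.Adam(model.parameters(), lr=1e-3,
                       fused=torch.cuda.is_available())

loss = model(torch.randn(8, 4, device=device)).sum()
loss.backward()
opt.step()
```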
-
(1).
- When I tested the code with the SCPO methods on the Goal_Point_8Hazards and Goal_Point_8Pillars tasks, only the "hazard" task showed convergence of the cost performance, not the "pillar"-related tasks. (see red cos…
-
**Is your feature request related to a problem? Please describe.**
I notice that currently, if we need to load a trained model, we must first instantiate an agent. Take `PPO` as an example: we need…
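The requested pattern is commonly implemented as a `classmethod` constructor that rebuilds the agent from the checkpoint alone (as in Stable-Baselines3's `PPO.load(path)`). A minimal sketch, with a hypothetical `Agent` class and a pickle checkpoint format standing in for the real ones:

```python
import pickle

class Agent:
    def __init__(self, lr=3e-4):
        self.lr = lr
        self.weights = None  # stand-in for real model parameters

    def save(self, path):
        with open(path, "wb") as f:
            pickle.dump({"lr": self.lr, "weights": self.weights}, f)

    @classmethod
    def load(cls, path):
        # reconstruct the agent entirely from the checkpoint,
        # so the caller never has to instantiate one first
        with open(path, "rb") as f:
            ckpt = pickle.load(f)
        agent = cls(lr=ckpt["lr"])
        agent.weights = ckpt["weights"]
        return agent
```

With this pattern, `Agent.load("ckpt.pkl")` is a one-liner; the hyperparameters needed to rebuild the agent travel inside the checkpoint itself.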
-
### 🚀 Describe the improvement or the new tutorial
For historical reasons, TorchRL privately hosts a bunch of tutorials.
We'd like to bring the most significant ones to pytorch tutorials for more vi…
-
Hi, I have some questions and hope for your answers.
1. How should I understand "gold"? For example, gold distributions, gold trajectories, etc. Does it mean "oracle"? And how did you obtain this data?
2. When I run finetune_en…
-
Stable Baselines 3 has natively integrated hyperparameter tuning via https://github.com/DLR-RM/rl-baselines3-zoo. Generally in reinforcement learning research, trying hyperparameter tuning is almost r…
-
### Description
When I use `jit` and `vmap` on a function with `concatenate` and `dot` as below:
```python
def f(a: jax.Array, c1: jax.Array, c2: jax.Array) -> jax.Array:
    '''A common opera…
```
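A self-contained function of this shape, composed with `jit` and `vmap`, looks like the sketch below. The body is my own assumption of what the truncated function does (concatenate two constant vectors, then take a dot product with the batched input), not the reporter's exact code.

```python
import jax
import jax.numpy as jnp

def f(a: jax.Array, c1: jax.Array, c2: jax.Array) -> jax.Array:
    # concatenate the two constant vectors, then dot with the input vector
    c = jnp.concatenate([c1, c2])
    return jnp.dot(a, c)

# vmap over the first argument only; c1 and c2 are shared across the batch
batched = jax.jit(jax.vmap(f, in_axes=(0, None, None)))

a = jnp.ones((3, 4))      # batch of 3 vectors of length 4
c1 = jnp.ones((2,))
c2 = jnp.ones((2,))
out = batched(a, c1, c2)  # shape (3,)
```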
-
### Required prerequisites
- [X] I have read the documentation.
- [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/safe-rlhf/issues) and [Discussions](https://github.com/PKU-…
-
### What happened + What you expected to happen
Both the [overview of algorithms](https://docs.ray.io/en/latest/rllib/rllib-algorithms.html#) and the [README.md of dreamerv3](https://github.com/ray…