-
**Describe the bug**
Hello,
I have trained a model using CRR. When I try to predict actions the same way as with the other methods (those don't have this problem; everything is the same except that I change the method):
from d3rlpy.…
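For reference, a minimal sketch of how action prediction usually looks with a trained d3rlpy algorithm, assuming the d3rlpy 1.x-style API; the dataset contents, shapes, and settings below are placeholders rather than details taken from this report:

```python
# Minimal sketch (d3rlpy 1.x-style API); all shapes and values are placeholders.
import numpy as np
from d3rlpy.algos import CRR
from d3rlpy.dataset import MDPDataset

# Toy continuous-control dataset: 100 transitions, 8-d observations, 2-d actions.
terminals = np.zeros(100, dtype=np.float32)
terminals[-1] = 1.0
dataset = MDPDataset(
    observations=np.random.random((100, 8)).astype(np.float32),
    actions=np.random.random((100, 2)).astype(np.float32),
    rewards=np.random.random(100).astype(np.float32),
    terminals=terminals,
)

crr = CRR(use_gpu=False)
crr.build_with_dataset(dataset)   # in practice: crr.fit(...) or crr.load_model(...)
actions = crr.predict(dataset.observations[:10])   # same predict() call as other algorithms
print(actions.shape)              # (10, 2)
```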
-
So what I essentially need is to have something like
"venv = ProcgenEnv(num_envs=" ... but for retro.make(). Running multiple retro environments is causing issues for me, and retrowrapper isn't…
-
## Description
When running On-policy with multiple actors, `ExperienceCollectionUtils.stack` builds a list in which each element contains the results of `CIMTrajectoryForAC.on_finish`. At this po…
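To illustrate the structure being described (hypothetical keys and shapes only, not MARO's actual data layout): with N actors, the result is a list of length N whose i-th element is whatever actor i's `on_finish` returned, e.g.:

```python
# Illustrative only; keys and shapes are made up, not MARO's actual layout.
import numpy as np

actor_0 = {"states": np.zeros((5, 3)), "returns": np.ones(5)}   # actor 0's on_finish() result
actor_1 = {"states": np.zeros((7, 3)), "returns": np.ones(7)}   # actor 1's on_finish() result

stacked = [actor_0, actor_1]     # one element per actor, results kept separate
for i, result in enumerate(stacked):
    print(i, result["states"].shape)   # 0 (5, 3) / 1 (7, 3)
```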
-
For the simulation we need to decide on the parameters of the RL algorithm and the parameters of the environment (one way to organize these choices is sketched after the list below).
**Reinforcement Learning**
- parallel environments?
- neural network (size, structure, L…
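A hypothetical way to collect these choices in a single place; every name and value below is a placeholder, not a recommendation:

```python
# Hypothetical parameter sketch; all names and values are placeholders.
config = {
    # Reinforcement learning
    "num_parallel_envs": 8,          # parallel environments?
    "hidden_sizes": (256, 256),      # neural network size / structure
    "learning_rate": 3e-4,
    "discount_gamma": 0.99,
    # Environment
    "episode_length": 1000,
    "random_seed": 0,
}
```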
-
**Describe the bug**
ACKTR raises an error when training on the GPU.
**Code example**
```python
import pandas as pd  # import implied by the read_csv call below
CSV_IMPUTE = f"data/{asset}_5S_IMPUTED.csv"
df = pd.read_csv(CSV_IMPUTE, parse_dates=["created…
```
-
Upon reading the [`sac_impl.hpp`](https://github.com/mlpack/mlpack/blob/master/src/mlpack/methods/reinforcement_learning/sac_impl.hpp), I realized that it's not an implementation of Soft Actor Critic …
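For reference, the defining ingredient of SAC (Haarnoja et al., 2018) is its entropy-regularized objective: the stochastic policy $\pi_\theta$ is updated to maximize

$$
J_\pi \;=\; \mathbb{E}_{s \sim \mathcal{D},\; a \sim \pi_\theta}\!\left[\, Q_\phi(s, a) \;-\; \alpha \log \pi_\theta(a \mid s) \,\right],
$$

where $\alpha$ is the entropy temperature (fixed or automatically tuned). The $\alpha \log \pi_\theta$ term and the stochastic, reparameterized policy it requires are what distinguish SAC from deterministic actor-critic methods such as DDPG or TD3.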
-
# The state input is a camera image
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision.models as models
import gc
from torch.distributions import Categorical
from torch…
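Since the snippet is cut off, here is a minimal sketch of the kind of network those imports suggest (a camera-image encoder feeding a categorical policy head); the architecture, feature size, and action count are assumptions, not the author's actual code:

```python
# Sketch only: assumed architecture, not the author's actual network.
import torch
import torch.nn as nn
import torchvision.models as models
from torch.distributions import Categorical

class ImagePolicy(nn.Module):
    def __init__(self, num_actions: int):
        super().__init__()
        backbone = models.resnet18()      # camera-image encoder
        backbone.fc = nn.Identity()       # drop the ImageNet classification head
        self.encoder = backbone           # outputs 512-d features
        self.logits = nn.Linear(512, num_actions)

    def forward(self, obs):               # obs: (B, 3, H, W) camera images
        features = self.encoder(obs)
        return Categorical(logits=self.logits(features))

policy = ImagePolicy(num_actions=4)
dist = policy(torch.randn(2, 3, 224, 224))
action = dist.sample()                    # shape: (2,)
```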
-
As stated in #307, as far as I know, VPG itself is not a very formal algorithm in the literature (it first appears in Spinning Up's docs, I think) and is loosely defined. In Spinning Up's impleme…
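For reference, the gradient estimator given in Spinning Up's VPG documentation is

$$
\nabla_\theta J(\pi_\theta) \;=\; \mathbb{E}_{\tau \sim \pi_\theta}\!\left[ \sum_{t=0}^{T} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, A^{\pi_\theta}(s_t, a_t) \right],
$$

i.e. on-policy REINFORCE with an advantage estimate; the choice of advantage estimator (e.g. GAE) and baseline is exactly where implementations tend to differ.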
-
Intuitively, applying it directly shouldn't improve results this much.
Also, why does the Attention-based method oscillate in the later stages and cause performance to drop?
-
### What is the problem?
Ray will find a GPU and place the model (e.g. FCNet) on the GPU even when `num_gpus=0`.
Stack Trace:
```
../../miniconda3/envs/ray/lib/python3.7/site-packages/ray/rl…
```
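A minimal sketch of the kind of setup being described, assuming the older `ray.rllib.agents` API implied by the Python 3.7-era stack trace; the algorithm, environment, and config values are placeholders:

```python
# Sketch only; algorithm, env, and config values are placeholders, and the
# import path assumes the older ray.rllib.agents API.
import ray
from ray.rllib.agents.ppo import PPOTrainer

ray.init()
trainer = PPOTrainer(
    env="CartPole-v0",
    config={
        "num_gpus": 0,        # request CPU-only execution...
        "num_workers": 1,
        "framework": "torch",
    },
)
trainer.train()   # ...yet the FCNet model is still placed on a visible GPU
```

Until this is resolved, a common way to force CPU execution is to hide the GPU from the process entirely, e.g. by setting `CUDA_VISIBLE_DEVICES=""` before any CUDA-aware imports.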