actor-critic Search Results

1000+ results
for actor-critic

tensorflow/probability #519

With TF 2.0 preview, "old-style" variational layers error wi…

Hi, I have unit tests for `Convolution1DReparameterization` , `Convolution1DReparameterization` etc. that basically look like this ``` x = tf.ones(shape = [150,1]) y = tf.ones(shape = [150]) …

skeydan updated 4 years ago
20
Rintarooo/TSP_DRL_PtrNet #1

possible error in `critic.py` `forward()`?

Very helpful repo! One question, in the `forward` function in `critic.py`, there might possibly be an error: In line `37`, the `Decoder` always takes in the same initial `dec_input` for each city in…

jingweiz updated 3 years ago
5
Stable-Baselines-Team/stable-baselines3-contrib #202

[Feature Request] Hybrid PPO

### 🚀 Feature Hello, in accordance with DLR-RM/stable-baselines3#1624, @SimRey and I would like to implement **Hybrid PPO** in this library. [This](https://arxiv.org/pdf/1903.01344.pdf) is the pa…

AlexPasqua updated 3 weeks ago
3
microsoft/DeepSpeedExamples #556

【problem discuss】Critic Loss can not decrease

Here is my situation: 1. finished step 2 with cohere/zhihu_query dataset. The final reward score is 5.07, rejected score is 0.8, and the acc is 0.79. So step 2 seems successful. 2. when I atte…

watermelon-lee updated 1 year ago
17
microsoft/DeepSpeed #4356

[BUG] RuntimeError(f"{param.ds_summary()} already in registr…

Even though my local copy of repository is up to date I am encountering this error. Log is below. Last line of the log shows the command I run with all the options. Epoch: 0 | Step: 75 | PPO Epoch:…

omeruth updated 8 months ago
13
openai/baselines #953

Allow specifying an activation function for the output layer…

I wanted to test an architecture for PPO2 where the actor and critic share the hidden layers, but the actor's output layer has a `tanh` activation function instead of the default linear one. If I spec…

williamjshipman updated 5 years ago
3
toybox-rs/Toybox #40

try some pytorch models

The following already works with a 'gym': https://github.com/pytorch/examples/blob/master/reinforcement_learning/actor_critic.py

jjfiv updated 3 years ago
3
salesforce/CodeRL #30

Problems in reproducing the RL fine-tuned results

Hi, thanks for open-sourcing your amazing work! I have been trying to reproduce the RL fine-tuned results reported in the paper, but unfortunately, I am encountering some issues. Here is a brief o…

abhik1505040 updated 1 year ago
8
keiohta/tf2rl #50

Implement discrete version of SAC

[Soft Actor-Critic for Discrete Action Settings](https://arxiv.org/abs/1910.07207v1)

keiohta updated 4 years ago
4
LisandraMoura/Mario-kart-RL #2

Mario Kart simulation environment

#### Initial tests - [x] Choose a Mario Kart 64 repository as a baseline for comparison - [x] Test the repository and see how it works - [x] Study the repository (collect the parameters used)…

LisandraMoura updated 3 weeks ago
3
