In the results of Chapter 14 (Deterministic Policy Gradients) in the book,
why is the training so unstable and noisy?
-------------------
![screenshot](https://user-images.githubusercontent.com/475557…
-
Here is the solution:
https://github.com/philtabor/Multi-Agent-Deep-Deterministic-Policy-Gradients/issues/2#issuecomment-912548033
-
Distributed Distributional Deterministic Policy Gradients (D4PG)
Reference implementation:
- https://github.com/deepmind/acme
PyTorch implementation:
- https://github.com/fabiopardo/tonic
-
The pricing policy has parameters $\theta$, and our goal is to optimize the simulation to produce maximum profit.
To do so, we need to calculate the gradient of the objective function (profit) w.r…
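Not from the original thread, but as a minimal sketch of the idea above: when the profit simulation is a black box, the gradient of profit with respect to the pricing parameters $\theta$ can be estimated by central finite differences and used for gradient ascent. `simulate_profit` here is a hypothetical toy simulator, not the actual pricing model.

```python
import numpy as np

def simulate_profit(theta):
    # Hypothetical stand-in for the pricing simulation: profit is concave
    # in the parameters and peaks at theta = [2.0, 1.0].
    target = np.array([2.0, 1.0])
    return -np.sum((theta - target) ** 2)

def profit_gradient(theta, eps=1e-5):
    # Central finite-difference estimate of d(profit)/d(theta),
    # one coordinate at a time.
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        step = np.zeros_like(theta)
        step[i] = eps
        grad[i] = (simulate_profit(theta + step)
                   - simulate_profit(theta - step)) / (2 * eps)
    return grad

# Simple gradient ascent on the simulated profit.
theta = np.array([0.0, 0.0])
for _ in range(200):
    theta = theta + 0.1 * profit_gradient(theta)
```

In practice one would use an analytic or automatic-differentiation gradient (as deterministic policy gradient methods do) rather than finite differences, which scale poorly with the number of parameters.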
-
https://github.com/philtabor/Multi-Agent-Deep-Deterministic-Policy-Gradients/blob/a3c294aa6834f348a7401306dff3e67919c861f5/maddpg.py#L74
Hi Phil,
Could you please help me understand what's …
-
Hi, thank you for sharing the repo!
I was wondering how Train_Return and Test_Return are calculated,
and what the difference between the two is.
I see that one is using norm_a_tf and sample_a_tf …
-
Hi, may I ask which paper this code implements?
-
### 🐛 Describe the bug
I was wondering whether anyone with experience in full-parameter fine-tuning of the Llama 2 7B model using FSDP can help: I put in all kinds of seeding possible to make training deter…
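As a hedged sketch of the "all kinds of seeding" the report mentions (an illustration, not the poster's actual code): seeding every random number generator in play is the usual first step toward reproducible training. The PyTorch-specific calls are noted in comments as assumptions, since only stdlib and NumPy RNGs are exercised here.

```python
import random
import numpy as np

def seed_everything(seed):
    # Seed the Python and NumPy RNGs. In a PyTorch setup one would also
    # call torch.manual_seed(seed) and torch.cuda.manual_seed_all(seed),
    # and enable torch.use_deterministic_algorithms(True); those lines are
    # assumptions about the poster's environment and are omitted here.
    random.seed(seed)
    np.random.seed(seed)

seed_everything(42)
a = np.random.rand(3)
seed_everything(42)
b = np.random.rand(3)
# Reseeding reproduces the same draws, which is the reproducibility goal.
```

Note that seeding alone does not guarantee determinism under distributed training (e.g. FSDP), where reduction order and nondeterministic CUDA kernels can still introduce run-to-run variation.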