deterministic-policy-gradients Search Results

138 results
for deterministic-policy-gradients

Tencent/PocketFlow #149

Which AutoML method is used in the current version of Pocket…

Hi, you mentioned on GitHub that Gaussian Processes (GP, Mockus, 1975), Tree-structured Parzen Estimator (TPE, Bergstra et al., 2013), and Deep Deterministic Policy Gradients (DDPG, Lillicrap et al.,…

Moran232 updated 1 year ago
2
sezan92/sezan92.github.io #14

Blog reinforce Discrete method

## Objective Now that the discrete REINFORCE method of the reinforcement learning algorithm has been implemented, the next task is to write a blog post about it. This issue tracks that work. ## Tas…

sezan92 updated 1 year ago
43
SciSharp/TensorFlow.NET #760

GradientTape.gradient returning null

I am porting the Keras [Actor Critic Method](https://keras.io/examples/rl/actor_critic_cartpole/#visualizations) to TensorFlow.NET, and when I attempt to calculate the gradients it returns null. ``…

alexhiggins732 updated 1 year ago
12
thu-ml/tianshou #742

In PPOPolicy, the ratio is computed with requires_grad `True…

The bug causes exploding gradients when an action mask is added in the dist_fn.

imerme updated 1 year ago
4
DLR-RM/stable-baselines3 #1127

[Bug]: Action Space 'clipped' at 1 in basic cases clips many…

### 🐛 Bug **So I'm writing this as a bug, though it seems to be something partly intentional. That said, I think it may have quite a significant (dire) impact on some training runs.** When a basic env…

arminvburren updated 1 year ago
4
HumanCompatibleAI/imitation #420

Randomness control for different `exploration_frac` in prefe…

## Background - Previously, [`exploration_frac`](https://github.com/HumanCompatibleAI/imitation/blob/3d7a76b8c587a25e380aeb09f65b764d7693aeea/src/imitation/algorithms/preference_comparisons.py#L212) …

yawen-d updated 2 years ago
5
DLR-RM/stable-baselines3 #1046

[Question] Custom action space with PPO

### Question Hello, is it possible to create a custom action space to use with PPO? From what I read in the documentation, there are limited `Space` instances allowed. But that means that I have…

tfederico updated 2 years ago
5
typst/typst #1056

Integration with external tools

## Motivation This RFC discusses mechanisms to interact with tools or data outside the Typst document and its direct file system environment. Possible use cases: - Use external tooling, e.g. to ge…

laurmaedje updated 11 months ago
126
probml/pml2-book #201

Word repetitions (PDF Version: 2023-01-02)

Found a bunch of repeated words, many of which appear to be erroneous (e.g. "An event is an an element"). I haven't checked all of them though. Repeated token 'an' at: discuss in Section 2.1.1.4…

gdemelo updated 1 year ago
2
ray-project/ray #18758

[Feature][rllib/tune] Deprecate RLLib's rollout/evaluate in …

### Search before asking - [X] I had searched in the [issues](https://github.com/ray-project/ray/issues) and found no similar feature requirement. ### Description Currently, there's a major gap, o…

andras-kth updated 2 years ago
19
