-
Hi, you mentioned on GitHub that Gaussian Processes (GP, Mockus, 1975), Tree-structured Parzen Estimator (TPE, Bergstra et al., 2013), and Deep Deterministic Policy Gradient (DDPG, Lillicrap et al.,…
-
## Objective
The discrete REINFORCE method of the reinforcement learning algorithm has been implemented. The next task is to write a blog post about the REINFORCE method. This issue tracks that work.
## Tas…
-
I am porting the Keras [Actor Critic Method](https://keras.io/examples/rl/actor_critic_cartpole/#visualizations) to TensorFlow.NET, and when I try to compute the gradients, the call returns null.
``…
-
This bug causes exploding gradients when an action mask is added in `dist_fn`.
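
For context, the report does not show how the mask is applied inside `dist_fn`. A common pattern for action masking, sketched here as an assumption rather than as the library's implementation, is to replace the logits of invalid actions with a large finite negative value before the softmax, so that no `inf` or `NaN` enters the computation. The name `masked_softmax` and the `fill` parameter are hypothetical.

```python
import math


def masked_softmax(logits, mask, fill=-1e9):
    """Softmax over `logits`, giving invalid actions (mask[i] is False) zero probability.

    Hypothetical helper: `fill` is a large finite negative stand-in for -inf.
    """
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Overwrite the logits of invalid actions with the fill value.
    masked = [x if keep else fill for x, keep in zip(logits, mask)]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(masked)
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


# The second action is masked out and receives zero probability.
probs = masked_softmax([2.0, 1.0, 0.5], [True, False, True])
```

Using a finite fill value instead of `-inf` is a design choice: it keeps every intermediate value finite, which is usually what matters when gradients are later taken through the same computation.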
-
### 🐛 Bug
**I'm filing this as a bug, though the behavior seems to be partly intentional. That said, I think it may have a significant (dire) impact on some training runs.**
When a basic env…
-
## Background
- Previously, [`exploration_frac`](https://github.com/HumanCompatibleAI/imitation/blob/3d7a76b8c587a25e380aeb09f65b764d7693aeea/src/imitation/algorithms/preference_comparisons.py#L212) …
-
### Question
Hello,
is it possible to create a custom action space to use with PPO? From what I read in the documentation, there are limited `Space` instances allowed. But that means that I have…
-
## Motivation
This RFC discusses mechanisms for interacting with tools or data outside the Typst document and its direct file system environment. Possible use cases:
- Use external tooling, e.g. to ge…
-
Found a bunch of repeated words, many of which appear to be erroneous (e.g. "An event is an an element"). I haven't checked all of them though.
Repeated token 'an' at:
discuss in Section 2.1.1.4…
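
A check like this can be reproduced with a short script. The following is a minimal sketch, not the tool used for the report above; the function name `find_repeated_words` is hypothetical, and it only detects a word immediately followed by the same word (case-insensitive).

```python
import re

# A word, some whitespace, then the same word again as a whole word.
REPEATED_WORD = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)


def find_repeated_words(text):
    """Return (word, character offset) for each immediately repeated word."""
    return [(m.group(1), m.start()) for m in REPEATED_WORD.finditer(text)]


hits = find_repeated_words("An event is an an element")
```

Each hit still needs a manual look, since some repetitions ("had had", "that that") are legitimate.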
-
### Search before asking
- [X] I had searched in the [issues](https://github.com/ray-project/ray/issues) and found no similar feature requirement.
### Description
Currently, there's a major gap, o…