-
So, I have a custom AEC environment written in PettingZoo that has an action_mask (it's a board game with a large number of legal moves, so it's necessary to mask out illegal moves during training),…
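For readers unfamiliar with the pattern, here is a minimal sketch of sampling only legal actions from such a mask. The helper is hypothetical (not part of any specific library); it assumes the common PettingZoo convention of an `"action_mask"` entry in the observation dict, with 1 marking legal actions and 0 illegal ones.

```python
import numpy as np

def sample_legal_action(observation, rng=None):
    """Hypothetical helper: sample uniformly among legal actions.

    Assumes observation is a dict with an "action_mask" entry
    (1 = legal, 0 = illegal), as in many PettingZoo board games.
    """
    rng = rng or np.random.default_rng()
    mask = np.asarray(observation["action_mask"])
    legal = np.flatnonzero(mask)      # indices of the legal actions
    return int(rng.choice(legal))

obs = {"action_mask": np.array([0, 1, 0, 1])}
action = sample_legal_action(obs)
assert action in (1, 3)               # only legal moves are sampled
```

During training, the same mask is typically applied to the policy logits (e.g. setting illegal entries to a large negative value) so that illegal moves get zero probability.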
-
I'm not sure this needs to be addressed, or even should be, but I wanted to make sure people are aware of the issue:
In C++, a `compare_exchange` that fails is treated as a load for memory-model purposes. "If t…
-
Hi, I've got the test results after following your kind instructions, and thank you again. But the results are weird; here they are:
Corpus mode: Yelp
Pair mode: semantic
Epoch: 0 supervised loss…
-
### Required prerequisites
- [X] I have read the documentation.
- [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/safe-rlhf/issues) and [Discussions](https://github.com/PKU-…
-
Hi,
Thanks so much for maintaining this very easy-to-use library!
I wanted to report what I think is a bug in how transitions are overwritten in the (FIFO) buffer once it is full.
The `prev_transiti…
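For context, here is a minimal ring-buffer sketch (not the library's implementation) showing the overwrite behavior in question: once the buffer is full, the write index wraps around and clobbers the oldest transition, which is where bookkeeping about the previous transition can go stale.

```python
# Minimal FIFO ring-buffer sketch (illustrative only, not the library code).
class FIFOBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = [None] * capacity
        self.idx = 0      # next write position; wraps around when full
        self.size = 0

    def add(self, transition):
        self.data[self.idx] = transition          # overwrites the oldest entry
        self.idx = (self.idx + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

buf = FIFOBuffer(3)
for t in range(5):        # insert 5 transitions into a capacity-3 buffer
    buf.add(t)
assert buf.data == [3, 4, 2]   # transitions 0 and 1 were overwritten
```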
-
## Describe the bug
I am trying to use the `DiscreteSACLoss` in a multi-agent environment, following this [tutorial](https://pytorch.org/rl/tutorials/multiagent_ppo.html). …
-
Hi all,
I was wondering whether the PPO-based MARL algorithms you use in the paper are taken from RLlib, or whether they are already available in the library without needing an RLlib interface.
I …
-
Your examples make it very easy to experiment with reinforcement learning simulations.
I'm running through them, but it's hard to tell whether the reinforcement learning is actually working.
How many times…
-
With the growing complexity of the RL/ML side of this project, the training code is starting to hit the limits of the current TFJS/Node capabilities, requiring some frameworks/algorithms to be re-implemented …
-
I'm trying to understand the following code fragment in the MPU6050_DMP6 example:
```
// check for overflow (this should never happen unless our code is too inefficient)
if ((mpuIntStatus…
```