-
### Description
I start Turbo matchmaking and click accept; it plays the sound as if I'm entering the lobby, but nothing happens. It looks like I can rejoin, but it just makes that dunk "lobby join" sound…
-
Our shootout performance at the RoboCup 2023 competition ... leaves room for improvement.
The focus here is the defending capabilities of the robot, especially jumping in the right direction at the …
-
We currently update penalties as follows:
auto const diff = params.targetFeasible - feasPct;
if (-0.05 < diff && diff < 0.05) // allow some margins on the difference
return pen…
-
Hi Team,
While using the PPO pipeline, we at times observe spikes in response length and were curious whether any techniques related to a length penalty are available or have been explored.
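To make the question concrete, the kind of length penalty we have in mind could look like the sketch below. It is only an illustration under assumptions: the function name `apply_length_penalty`, the `target_len` budget, and the `coef` weight are hypothetical and are not part of any existing pipeline.

```python
from typing import List


def apply_length_penalty(
    rewards: List[float],
    response_lengths: List[int],
    target_len: int = 256,  # hypothetical length budget, in tokens
    coef: float = 0.01,     # hypothetical penalty per token over the budget
) -> List[float]:
    """Subtract a linear penalty for every token beyond target_len."""
    if len(rewards) != len(response_lengths):
        raise ValueError("rewards and response_lengths must have the same length")
    penalized = []
    for reward, length in zip(rewards, response_lengths):
        # Responses within the budget keep their reward unchanged.
        excess = max(0, length - target_len)
        penalized.append(reward - coef * excess)
    return penalized
```

A linear penalty on the excess only is one of several possible shapes; it leaves short responses alone, so it should not push the policy toward trivially short outputs.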
-
### What happened?
Sometimes the part of the initial prompt that should be considered for the penalties is ignored. Only the newly generated tokens are used for calculating the penalty. For now I can ass…
-
**Github username:** --
**Twitter username:** --
**Submission hash (on-chain):** 0xe2f5045df3b2ba5395f8efc92f548f7ce92cca2348ac930c0418087cabfa8709
**Severity:** medium
**Description:**
**Descriptio…
-
ChatGPT provides settings that give more control over the execution of the prompt.
Maybe it is possible to make these settings available per prompt.
With these settings, you can control:
1. **Tem…
-
Hello!
Thank you for this library and for your effort maintaining it for so long!
In using this library, I have recently concluded that three of the four penalty rules for selecting a mask patt…
-
### Question
When I try to use repetition_penalty to avoid repeated answers, I get this error: "CUDA error: device-side assert triggered". After debugging, I found that the input_ids include -200, which is…
-
### Feature request
The `OnPolicyConfig` has a flag: `non_eos_penalty: bool = False`, which is described as: `"""whether to penalize responses that do not contain stop_token_id"""`
I interpreted…