-
Hi, awesome work and thanks for open-sourcing this!
In each iteration, it seems that only the 20k dataset annotated by the reward model is used. However, the pipeline in the paper shows that the historical data…
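For what it's worth, the paper's description suggests accumulating preference pairs across iterations rather than training on the fresh batch alone. A minimal sketch of that accumulation step (illustrative names, not the repo's actual API):

```python
def build_iteration_dataset(historical, new_annotated):
    """Merge previously collected preference pairs with the fresh
    reward-model-annotated batch for the current iteration.

    `historical` may be empty/None on the first iteration; in that case
    only the new batch is used. Both arguments are assumed to be
    sequences of preference examples (e.g. dicts of prompt/chosen/rejected).
    """
    if not historical:
        return list(new_annotated)
    return list(historical) + list(new_annotated)
```

With `datasets.Dataset` objects the same idea would use `concatenate_datasets`; the point is only that each iteration trains on the union, not the latest 20k alone.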
-
- [ ] [tabby/README.md at main · TabbyML/tabby](https://github.com/TabbyML/tabby/blob/main/README.md?plain=1)
# 🐾 Tabby
[![latest release](https://shield…
-
As titled, I found some issues:
* Bad choice of prompt: given that the Anthropic HH dataset is a multi-turn conversation between Human and Assistant, this line applies a left-most split, i.e., uses the first Hum…
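To illustrate the split issue on an HH-style transcript (a sketch, not the repo's code): a left-most split keeps only the first human turn, whereas splitting on the *last* `Assistant:` marker keeps the full multi-turn context as the prompt.

```python
def extract_prompt(transcript: str) -> str:
    """Return everything up to the final assistant reply.

    A buggy left-most split, transcript.split("\n\nAssistant:")[0],
    would drop every turn after the first human message. Using rsplit
    with maxsplit=1 cuts at the LAST assistant marker instead, so the
    prompt retains the whole conversation history.
    """
    return transcript.rsplit("\n\nAssistant:", 1)[0]

convo = (
    "\n\nHuman: Hi"
    "\n\nAssistant: Hello"
    "\n\nHuman: How are you?"
    "\n\nAssistant: Fine."
)
```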
-
- [ ] [README.md · defog/sqlcoder-7b-2 at main](https://huggingface.co/defog/sqlcoder-7b-2/blob/main/README.md?code=true)
**DESCRIPTION:**
```yaml
license:…
-
```
dpo_trainer = DPOTrainer(
  File "/usr/local/lib/python3.10/dist-packages/trl/trainer/dpo_trainer.py", line 371, in __init__
    train_dataset = train_dataset.map(self.tokenize_row, num_proc=sel…
```
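Failures at this `tokenize_row` map step often come from the dataset's shape: TRL's `DPOTrainer` expects each example to carry `prompt`, `chosen`, and `rejected` fields as strings. A quick pre-flight check before constructing the trainer (a hypothetical helper, not part of TRL):

```python
REQUIRED_COLUMNS = ("prompt", "chosen", "rejected")

def validate_dpo_example(example: dict) -> list:
    """Return a list of problems with one preference example.

    An empty list means the example looks usable: all three required
    columns are present and hold plain strings (not, e.g., lists of
    chat-message dicts).
    """
    problems = []
    for key in REQUIRED_COLUMNS:
        if key not in example:
            problems.append(f"missing column: {key}")
        elif not isinstance(example[key], str):
            problems.append(
                f"{key} is {type(example[key]).__name__}, expected str"
            )
    return problems
```

Running this over a few rows of `train_dataset` before calling `DPOTrainer(...)` usually pinpoints the offending column faster than the traceback does.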
-
I have [tetherback version 0.9.1](https://github.com/dlenski/tetherback/releases/tag/0.9.1) installed.
It tries to unmount partitions that are not mounted, which results in a failure.
The script…
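A possible workaround while this is unfixed (an illustrative wrapper, not tetherback's own code): check `/proc/mounts` before unmounting, so `umount` is never invoked on a device that is not mounted.

```shell
maybe_umount() {
    dev="$1"
    # Each line of /proc/mounts starts with the mounted device path,
    # so a leading-anchor grep tells us whether $dev is mounted.
    if grep -qs "^$dev " /proc/mounts; then
        umount "$dev"
    else
        echo "skip: $dev not mounted"
    fi
}
```

The `-s` flag keeps `grep` quiet on systems without `/proc/mounts`, in which case the function simply skips the unmount.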
-
When I change the model from `Qwen1.5-7B-Chat` to `Qwen2-7B-Instruct`, the same error is still there.
_Originally posted by @chansonzhang in https://github.com/QwenLM/Qwen/issues/1307…
-
* Floating terms
* FIXED anterior_to posterior_to (ZP) and dorsal_to, in left_side_of, in_right_side_of,
* FIXED http://purl.obolibrary.org/obo/BSPO_0000096
* FIXED http://purl…
-
Hi,
Thank you for your work.
Could you please tell me the hyperparameters for Mistral Base 7B DPO (Zephyr), i.e., the vanilla DPO setup?
I used beta=0.01 and beta=0.1 but could not repro…
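For context on why beta matters here: in vanilla DPO, beta scales the implicit reward margin against the frozen reference model, so beta=0.01 vs beta=0.1 can change results substantially. A minimal sketch of the per-example loss (plain math on log-probabilities, not the training code from this repo):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * margin)), where the
    margin is the policy's log-ratio gain on the chosen response minus
    its gain on the rejected one, both measured against the reference
    model. Smaller beta penalizes drift from the reference less.
    """
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

At a zero margin the loss is log 2; a positive margin (policy prefers the chosen response more than the reference does) drives it toward zero.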