-
Hi, awesome work and thanks for open-sourcing this!
In each iteration, it seems that only the 20k dataset annotated by the reward model is used. However, the pipeline in the paper shows that the historical data…
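For what it's worth, the paper's description suggests accumulating preference pairs across iterations rather than training on the fresh batch alone. A minimal sketch of that accumulation step (illustrative names, not the repo's actual API):

```python
def build_iteration_dataset(historical, new_annotated):
    """Merge previously collected preference pairs with the fresh
    reward-model-annotated batch for the current iteration.

    `historical` may be empty/None on the first iteration; in that case
    only the new batch is used. Both arguments are assumed to be
    sequences of preference examples (e.g. dicts of prompt/chosen/rejected).
    """
    if not historical:
        return list(new_annotated)
    return list(historical) + list(new_annotated)
```

With `datasets.Dataset` objects the same idea would use `concatenate_datasets`; the point is only that each iteration trains on the union, not the latest 20k alone.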
-
- [ ] [tabby/README.md at main · TabbyML/tabby](https://github.com/TabbyML/tabby/blob/main/README.md?plain=1)
# 🐾 Tabby
[![latest release](https://shield…
-
As titled, I found some issues:
* Bad choice of prompt: given that the Anthropic HH dataset is a multi-turn conversation between Human and Assistant, this line applies a left-most split, i.e., uses the first Hum…
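To illustrate the split issue on an HH-style transcript (a sketch, not the repo's code): a left-most split keeps only the first human turn, whereas splitting on the *last* `Assistant:` marker keeps the full multi-turn context as the prompt.

```python
def extract_prompt(transcript: str) -> str:
    """Return everything up to the final assistant reply.

    A buggy left-most split, transcript.split("\n\nAssistant:")[0],
    would drop every turn after the first human message. Using rsplit
    with maxsplit=1 cuts at the LAST assistant marker instead, so the
    prompt retains the whole conversation history.
    """
    return transcript.rsplit("\n\nAssistant:", 1)[0]

convo = (
    "\n\nHuman: Hi"
    "\n\nAssistant: Hello"
    "\n\nHuman: How are you?"
    "\n\nAssistant: Fine."
)
```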
-
- [ ] [README.md · defog/sqlcoder-7b-2 at main](https://huggingface.co/defog/sqlcoder-7b-2/blob/main/README.md?code=true)
**DESCRIPTION:**
```yaml
license:…
-
```
dpo_trainer = DPOTrainer(
  File "/usr/local/lib/python3.10/dist-packages/trl/trainer/dpo_trainer.py", line 371, in __init__
    train_dataset = train_dataset.map(self.tokenize_row, num_proc=sel…
```
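Failures at this `tokenize_row` map step often come from the dataset's shape: TRL's `DPOTrainer` expects each example to carry `prompt`, `chosen`, and `rejected` fields as strings. A quick pre-flight check before constructing the trainer (a hypothetical helper, not part of TRL):

```python
REQUIRED_COLUMNS = ("prompt", "chosen", "rejected")

def validate_dpo_example(example: dict) -> list:
    """Return a list of problems with one preference example.

    An empty list means the example looks usable: all three required
    columns are present and hold plain strings (not, e.g., lists of
    chat-message dicts).
    """
    problems = []
    for key in REQUIRED_COLUMNS:
        if key not in example:
            problems.append(f"missing column: {key}")
        elif not isinstance(example[key], str):
            problems.append(
                f"{key} is {type(example[key]).__name__}, expected str"
            )
    return problems
```

Running this over a few rows of `train_dataset` before calling `DPOTrainer(...)` usually pinpoints the offending column faster than the traceback does.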
-
I have [tetherback version 0.9.1](https://github.com/dlenski/tetherback/releases/tag/0.9.1) installed.
It tries to unmount partitions that are not mounted, which results in a failure.
The script…
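A possible workaround while this is unfixed (an illustrative wrapper, not tetherback's own code): check `/proc/mounts` before unmounting, so `umount` is never invoked on a device that is not mounted.

```shell
maybe_umount() {
    dev="$1"
    # Each line of /proc/mounts starts with the mounted device path,
    # so a leading-anchor grep tells us whether $dev is mounted.
    if grep -qs "^$dev " /proc/mounts; then
        umount "$dev"
    else
        echo "skip: $dev not mounted"
    fi
}
```

The `-s` flag keeps `grep` quiet on systems without `/proc/mounts`, in which case the function simply skips the unmount.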
-
When I change the model from `Qwen1.5-7B-Chat` to `Qwen2-7B-Instruct`, the same error is still there.
_Originally posted by @chansonzhang in https://github.com/QwenLM/Qwen/issues/1307…
-
* Floating terms
* FIXED anterior_to posterior_to (ZP) and dorsal_to, in left_side_of, in_right_side_of,
* FIXED http://purl.obolibrary.org/obo/BSPO_0000096
* FIXED http://purl…
-
Hi,
Thank you for your work.
Could you please tell me the hyperparameters for Mistral Base 7B DPO (Zephyr), i.e., the vanilla DPO setup?
I used beta=0.01 and beta=0.1 but could not repro…
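For context on why beta matters here: in vanilla DPO, beta scales the implicit reward margin against the frozen reference model, so beta=0.01 vs beta=0.1 can change results substantially. A minimal sketch of the per-example loss (plain math on log-probabilities, not the training code from this repo):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * margin)), where the
    margin is the policy's log-ratio gain on the chosen response minus
    its gain on the rejected one, both measured against the reference
    model. Smaller beta penalizes drift from the reference less.
    """
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

At a zero margin the loss is log 2; a positive margin (policy prefers the chosen response more than the reference does) drives it toward zero.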