-
Dear maintainers,
Thank you for your valuable arena. I am currently researching LLM evaluation methods and got stuck on a question about the Bradley-Terry model.
As it stands, from multiple sou…
-
Hello and thanks for your work!
While running bradley-terry-rm/llama3_rm.py, the final saved model does not have an lm head, as the script is using an AutoModelForSequenceClassification model and not …
-
"Not indendently, but by using Markov chain Monte Carlo." should read "independently".
"implicitly center the paremters around zero" should read "parameters".
-
@eric-mitchell Will you be adding an implementation of the Plackett-Luce ranking model in addition to the current Bradley-Terry model?
Looking forward to hearing from you!
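
For context on the question above, here is a minimal sketch of how the two models relate, assuming each candidate is summarized by a single real-valued score (a log-strength). The function names `bradley_terry_prob` and `plackett_luce_log_likelihood` are illustrative only and are not taken from the repository. Bradley-Terry scores a pairwise comparison, while Plackett-Luce scores a full ranking by repeatedly choosing the next-best item from those remaining.

```python
import math


def bradley_terry_prob(score_i, score_j):
    """Probability that item i is preferred over item j (Bradley-Terry)."""
    # Logistic function of the score difference.
    return 1.0 / (1.0 + math.exp(-(score_i - score_j)))


def plackett_luce_log_likelihood(scores_in_rank_order):
    """Log-probability of a full ranking, best item first (Plackett-Luce)."""
    total = 0.0
    for k in range(len(scores_in_rank_order)):
        remaining = scores_in_rank_order[k:]
        # Numerically stable log-sum-exp over the items not yet placed.
        m = max(remaining)
        log_norm = m + math.log(sum(math.exp(s - m) for s in remaining))
        total += scores_in_rank_order[k] - log_norm
    return total
```

With exactly two items, the Plackett-Luce probability of the ranking `[a, b]` equals the Bradley-Terry probability that `a` beats `b`, so the pairwise model is the two-item special case of the ranking model.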
-
[paper](https://arxiv.org/abs/2305.18290)
## TL;DR
- **I read this because.. :** for background knowledge
- **task :** RL
- **problem :** TRPO also requires training a separate reward model, which becomes too burdensome as the model grows larger
- **idea :** rewa…
-
### Category
Programming
### Website URL
https://chat.lmsys.org/?leaderboard
or its mirror on Hugging Face:
https://huggingface.co/spaces/lmsys/chatbot--leaderboard
### Website descripti…
-
Nice work! Starred already.
Sorry for asking, but why is the bos_token replaced with an empty string?
sample['positive'] = tokenizer.apply_chat_template(
sample['chosen'], tokenize=False, …
-
Hi,
I have replicated the training and evaluation for the pair_rm model, but I haven't achieved the results reported in Table 2 of the paper. The best results I obtained were with pm_models/llama3-…
-
Hi, congratulations on the great work and thanks for open-sourcing it!
I am running step 3.2 with pair-preference-model-LLaMA3-8B. However, I encountered the warning "Some weights of LlamaForSequenceCl…
-