-
The ability to create a discussion between different LLMs would be a cool addition. I'm not sure about the use case, but I tried it and it came out well. Here is the [gist](https://gist.github.com/Siddhesh-Agar…
-
Hi! Is there a specific reason that we train the reward model on absolute scores rather than on pairwise human preferences over the same prompts, as most other RLHF work does?
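For context, the pairwise alternative referred to here is usually a Bradley-Terry style objective: the reward model is trained so the preferred response scores higher than the rejected one. A minimal sketch (the `pairwise_reward_loss` helper and the toy scores are illustrative, not from any particular codebase):

```python
import math

def pairwise_reward_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry negative log-likelihood for one preference pair:
    -log sigmoid(r_chosen - r_rejected). Minimizing this pushes the
    reward of the human-preferred response above the rejected one."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Toy reward scores: when the model already ranks the chosen response
# higher, the loss is small; a reversed ranking is penalized more.
loss_correct_ranking = pairwise_reward_loss(2.0, 0.5)
loss_wrong_ranking = pairwise_reward_loss(0.5, 2.0)
```

Training on absolute scores instead would regress each response's score directly, which requires raters to use a consistent absolute scale rather than just ordering two responses.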
-
### Client Version:
515.1642
### Issue Summary:
Eye surgery never actually gives anyone their vision back. Replacing their eyes does not help, nor does Oculine. I had to delete and re-crea…
-
Preferences, beliefs, and values aren't as neatly separable in deliberative politics as they are in game theory (careful here: "values" is the wrong term to begin with).
Caplan scorns some of this confusion as _p…
-
Hi,
Thank you for your interesting work "Improving Generalization of Alignment with Human Preferences through Group Invariant Learning (ICLR 2024 Spotlight)"! I want to ask where I can find the Gi…
-
There was recently a discussion on Discord whereby someone was unable to change the font size on specific syntaxes. It initially seemed related to https://github.com/SublimeTextIssues/Core/issues/1551…
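The kind of per-syntax override being discussed is a syntax-specific settings file (e.g. a hypothetical `Python.sublime-settings` created via Preferences → Settings – Syntax Specific), with the font size set there:

```json
{
    // Applies only to files using this syntax; "font_size" is the
    // setting users reported being unable to change per-syntax.
    "font_size": 14
}
```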
-
# URL
- https://arxiv.org/abs/2305.18290
# Affiliations
- Rafael Rafailov, N/A
- Archit Sharma, N/A
- Eric Mitchell, N/A
- Stefano Ermon, N/A
- Christopher D. Manning, N/A
- Chelsea Finn, …
-
Hi @mrahtz, thanks for making this repo! I thought it might be useful to you or others to pass along some extra stuff I had to do to get this running on a fresh Ubuntu 18.04 install. Feel free to dele…
-
Hi @mrahtz, thanks for making this repo! I think this algorithm is a milestone in the development of deep reinforcement learning.
We installed all components according to the Pipfile and Pipfile.lock fi…
-
searx goes back to default preferences whenever I try to navigate to another mode (images/videos/it/etc.), go to the next page, or alter the search query... and it's always in gibberish; there's no way to m…