rlaif Search Results - Githubissues

RLHF-V/RLAIF-V #21

Would the data generation code be released?

In this issue, it is said that related code would be released at https://github.com/RLHF-V/RLAIF-V/issues/6, but I find this [link](https://github.com/RLHF-V/RLAIF-V#data-generation) is empty. Where c…

Gaffey updated 6 hours ago

kiddyboots216/lottery-ticket-adaptation #1

Steps to replicate the results given in Table 3 and 4

Hi, can you provide the steps, settings, and dataset that can be used to replicate the results of the Lotto method given in Tables 3 and 4? Is the instruction following dataset among `https://huggingface.c…

au-revoir updated 1 day ago

kiddyboots216/lottery-ticket-adaptation #2

About mask generating and adaptation

Hello, Ashwinee Panda. I was very impressed with your work and wanted to thank you for the excellent contribution. I am currently following the tutorial using the openbookqa task to finally experime…

HeeseongEom updated 2 weeks ago

RLHF-V/RLAIF-V #11

The LoRA training codes and scripts

A significant achievement in aligning Vision-Language Models! While running the code 'RLAIF-V/muffin/train/train_llava15.py', I noticed that all model parameters are trainable. Due to hardware limi…

darkpromise98 updated 1 month ago

wangclnlp/Vision-LLM-Alignment #4

Could you tell me the details of preference dataset.

{ "id": "000000245946", "image": "000000245946.jpg", "conversations": [ { "from": "human", "value": "\nWhat considerations…

hhhhzzzzz updated 8 hours ago

PKU-Alignment/align-anything #40

[Question] LLaVA DPO training loss increases

### Required prerequisites - [X] I have read the documentation. - [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/align-anything/issues) and [Discussions](https://github.com…

fangqi-Zhu updated 2 weeks ago

Liang-Jiaying/RLAIF #3

Questions to research and think

- [ ] Why does the author only compare RLAIF with RLHF on the summarization task? - [ ] How is the performance on other tasks? - [ ] For 4.1 Datasets, what other ways does OpenAI use to filter the data? - …

Liang-Jiaying updated 9 months ago

RLHF-V/RLAIF-V #25

the actual number of samples of the huggingface RLAIF-V-Data…

Hi there~ After reading the parquet files of the RLAIF-V-Dataset downloaded from Hugging Face, I actually got 83k samples, which is significantly more than the "30k data" mentioned in the README…

Molly-3000 updated 1 day ago

RLHF-V/RLAIF-V #20

When I use the cal_logp of all dataset, I met the question. Th…

![img_v3_02do_b346611a-ead2-4f84-9dd9-2b74cdc77afg](https://github.com/user-attachments/assets/68cee71a-4563-4597-b27b-4a26b3552c19)

XiaoLei2123 updated 1 day ago

huggingface/trl #1939

Unexpected Keyword Argument 'add_special_tokens' in LlavaPro…

Error: TypeError: LlavaProcessor.__call__() got an unexpected keyword argument 'add_special_tokens'. When trying to run https://github.com/huggingface/trl/blob/main/examples/scripts/dpo_visual.py with…

srikant86panda updated 1 week ago

50 results for rlaif
