rlaif Search Results - Githubissues

56 results for rlaif


RLHF-V/RLAIF-V #6

Self feedback data generation pipeline & reference model

Hi 2 quick questions, 1. From the paper algorithm1, I get a sense that the algorithm can work in an online divide-n-conquer manner with updated model and I am just curious when the self-feedback co…

charismaticchiu updated 2 months ago
7

modelscope/ms-swift #1734

DPO training error `UnboundLocalError: local variable 'num_p…

**Describe the bug** Getting the following error only by changing the model to `glm4v-9b-chat` from `llava1_6-mistral-7b-instruct` in the first DPO example [here](https://github.com/modelscope/ms-swi…

Lopa07 updated 3 days ago
3

wantedly/machine-learning-round-table #220

[2023/11/15] Recommendation and Machine Learning Study Group

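## Why The Recommendation and Machine Learning Study Group is a gathering of people motivated to improve services through recommendation, machine learning, and related technologies. From news and blog posts to papers, let's share whatever has caught our interest! To get the word out, this repository is kept **public**. If you would like to join from outside, send a DM to Higuchi (https://twitter.com/zerebom_3), or…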

zerebom updated 10 months ago
5

arXiv/html_feedback #1413

Scripted text

### Description Figure 1 badly rendered. ### (Optional:) Please add any files, screenshots, or other information here. _No response_ ### (Required) What is this issue most closely related to? Sele…

pmeyer-git updated 3 months ago
2

ultralytics/ultralytics #7829

Some NCCL operations have failed or timed out.

### Search before asking - [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report. ### YOLOv8 Component _No response_ ### Bug …

tamanna-mostafa updated 4 months ago
14

huggingface/peft #1443

size mismatch for base_model.model.model.layers.0.mlp.gate_p…

### System Info transformers version: 4.35.2 Platform: Linux-5.15.0-1050-aws-x86_64-with-glibc2.31 Python version: 3.10.12 Huggingface_hub version: 0.20.2 Safetensors versio…

tamanna-mostafa updated 1 month ago
16

stanfordnlp/dspy #145

Support for continuously valued validation metric "losses"

Is it possible for validation metrics to return non-boolean answers? This could bring a lot more control over the optimization strategy, e.g. loss weighting for different metrics. Another idea i…

elyxlz updated 7 months ago
6

FreedomIntelligence/HuatuoGPT #12

RLAIF training details

Will the RLAIF training details be open-sourced? Thanks.

hywchina updated 9 months ago
2

pytorch/pytorch #118666

torch.distributed.DistNetworkError: Connection reset by peer

### 🐛 Describe the bug I'm running following deepspeed command for finetuning in my venv: `deepspeed trainer_sft.py --configs llama2-7b-sft-RLAIF --wandb-entity tammosta --show_dataset_stats --dee…

tamanna-mostafa updated 4 months ago
2

huggingface/transformers #28742

safetensors_rust.SafetensorError: Error while deserializing …

### System Info transformers version: 4.35.2 Platform: Linux-5.15.0-1050-aws-x86_64-with-glibc2.31 Python version: 3.10.12 Huggingface_hub version: 0.20.2 Safetensors versio…

tamanna-mostafa updated 7 months ago
5
