-
Hi, two quick questions:
1. From Algorithm 1 in the paper, I get the sense that the algorithm can work in an online divide-and-conquer manner with an updated model, and I am just curious when the self-feedback co…
-
Is it possible for validation metrics to return non-boolean answers?
This could bring a lot more control over the optimization strategy, e.g. loss weighting for different metrics.
Another idea i…
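To illustrate what graded (non-boolean) metrics could enable, here is a minimal Python sketch of the loss-weighting idea; the metric functions, names, and weights below are hypothetical placeholders, not the library's actual API:
```python
from typing import Callable

# Toy graded metrics (hypothetical; a real setup would plug in proper
# scorers). Each returns a float in [0, 1] instead of a pass/fail boolean.
def fluency(output: str) -> float:
    # Crude proxy: penalize very short outputs.
    return min(len(output.split()) / 20.0, 1.0)

def positivity(output: str) -> float:
    # Crude proxy: fraction of "nice" words in the output.
    nice = {"great", "good", "excellent"}
    words = output.lower().split()
    return sum(w in nice for w in words) / max(len(words), 1)

metrics: dict[str, Callable[[str], float]] = {"fluency": fluency, "positivity": positivity}
weights = {"fluency": 0.3, "positivity": 0.7}  # per-metric loss weighting

def weighted_validation_loss(output: str) -> float:
    """Combine graded scores into one scalar; a higher score means a lower loss."""
    return sum(weights[m] * (1.0 - fn(output)) for m, fn in metrics.items())

print(weighted_validation_loss("This is a great and good example output."))
```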
-
### 🐛 Describe the bug
I'm running the following deepspeed command for fine-tuning in my venv:
`deepspeed trainer_sft.py --configs llama2-7b-sft-RLAIF --wandb-entity tammosta --show_dataset_stats --dee…
-
Will the RLAIF training details be open-sourced? Thanks.
-
Hi @daniel-furman, I want to start off by saying that this is a really cool repo! These scripts are extremely useful to a novice starting out with these libraries.
I mostly just see SFT notebooks. Do you…
-
### System Info
transformers version: 4.35.2
Platform: Linux-5.15.0-1050-aws-x86_64-with-glibc2.31
Python version: 3.10.12
Huggingface_hub version: 0.20.2
Safetensors versio…
-
I'm running the stock example with open-source models on Ollama. I have tried various open models (I did not write my own Modelfile), and they always ended up either delivering bad outputs, generating synta…
-
For research purposes, it would be great to use Llama 2 70B, like the ChatGPT API, to generate data for fine-tuning Galactica. To my understanding, it is only allowed to use the output of Llama 2 to fine-tune oth…
-
### Feature request
I would like the ability to save only selected adapters of a model using the `model.save_pretrained()` method, possibly renaming them (saving one selected adapter as "default", …
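A minimal sketch of how the requested call might look; the `selected_adapters` argument and the commented-out renaming argument represent the proposed behavior rather than confirmed PEFT API, and the adapter paths are placeholders:
```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Small base model purely for illustration.
base = AutoModelForCausalLM.from_pretrained("gpt2")

# Load two adapters under distinct names (placeholder paths).
model = PeftModel.from_pretrained(base, "path/to/adapter_a", adapter_name="adapter_a")
model.load_adapter("path/to/adapter_b", adapter_name="adapter_b")

# Proposed: save only adapter_b, optionally renamed to "default" on disk.
model.save_pretrained(
    "out_dir",
    selected_adapters=["adapter_b"],           # keep just this adapter
    # adapter_names={"adapter_b": "default"},  # hypothetical renaming argument
)
```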
-
Hi there! First of all, thanks for the great library!
I was wondering if it is possible to parse a multidimensional score. For example, I want to judge my model in 3 categories: 1) positivity, 2) grammar,…
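In case it helps clarify the request, here is a rough sketch of one way a multidimensional score could be represented and parsed, assuming the judge's output can be coerced to JSON; the use of pydantic and the third category name ("relevance") are my own placeholders, not this library's API:
```python
from pydantic import BaseModel, Field

class MultiScore(BaseModel):
    """One field per judged category, each constrained to [0, 1]."""
    positivity: float = Field(ge=0.0, le=1.0)
    grammar: float = Field(ge=0.0, le=1.0)
    relevance: float = Field(ge=0.0, le=1.0)  # placeholder third category

# Example JSON as a judge model might emit it.
raw = '{"positivity": 0.8, "grammar": 0.95, "relevance": 0.6}'
score = MultiScore.model_validate_json(raw)
print(score.positivity, score.grammar, score.relevance)
```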