-
Hi, two quick questions:
1. From Algorithm 1 in the paper, I get the sense that the algorithm can work in an online divide-and-conquer manner with an updated model, and I am just curious when the self-feedback co…
-
Is it possible for validation metrics to return non-boolean answers?
This could bring a lot more control over the optimization strategy, e.g. loss weighting for different metrics.
Another idea i…
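To illustrate what graded (non-boolean) metrics could enable, here is a minimal Python sketch of the loss-weighting idea; the metric functions, names, and weights below are hypothetical placeholders, not the library's actual API:
```python
from typing import Callable

# Toy graded metrics (hypothetical; a real setup would plug in proper
# scorers). Each returns a float in [0, 1] instead of a pass/fail boolean.
def fluency(output: str) -> float:
    # Crude proxy: penalize very short outputs.
    return min(len(output.split()) / 20.0, 1.0)

def positivity(output: str) -> float:
    # Crude proxy: fraction of "nice" words in the output.
    nice = {"great", "good", "excellent"}
    words = output.lower().split()
    return sum(w in nice for w in words) / max(len(words), 1)

metrics: dict[str, Callable[[str], float]] = {"fluency": fluency, "positivity": positivity}
weights = {"fluency": 0.3, "positivity": 0.7}  # per-metric loss weighting

def weighted_validation_loss(output: str) -> float:
    """Combine graded scores into one scalar; a higher score means a lower loss."""
    return sum(weights[m] * (1.0 - fn(output)) for m, fn in metrics.items())

print(weighted_validation_loss("This is a great and good example output."))
```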
-
### 🐛 Describe the bug
I'm running the following deepspeed command for fine-tuning in my venv:
`deepspeed trainer_sft.py --configs llama2-7b-sft-RLAIF --wandb-entity tammosta --show_dataset_stats --dee…
-
Will the RLAIF training details be open-sourced? Thanks.
-
Hi @daniel-furman, I want to start off by saying that this is a really cool repo! These scripts are extremely useful to a novice starting out with these libraries.
I mostly just see SFT notebooks. Do you…
-
### System Info
transformers version: 4.35.2
Platform: Linux-5.15.0-1050-aws-x86_64-with-glibc2.31
Python version: 3.10.12
Huggingface_hub version: 0.20.2
Safetensors versio…
-
I'm running the stock example with open-source models on Ollama. I have tried various open models (I did not write my own Modelfile), and they always ended up either delivering bad outputs, generating synta…
-
For research purposes, it would be great to use Llama 2 70B, like the ChatGPT API, to generate data for fine-tuning Galactica. To my understanding, it is only allowed to use the output of Llama 2 to fine-tune oth…
-
### Feature request
I would like the ability to save only selected adapters of a model using the `model.save_pretrained()` method, possibly renaming them (saving one selected adapter as "default", …
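A minimal sketch of how the requested call might look; the `selected_adapters` argument and the commented-out renaming argument represent the proposed behavior rather than confirmed PEFT API, and the adapter paths are placeholders:
```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Small base model purely for illustration.
base = AutoModelForCausalLM.from_pretrained("gpt2")

# Load two adapters under distinct names (placeholder paths).
model = PeftModel.from_pretrained(base, "path/to/adapter_a", adapter_name="adapter_a")
model.load_adapter("path/to/adapter_b", adapter_name="adapter_b")

# Proposed: save only adapter_b, optionally renamed to "default" on disk.
model.save_pretrained(
    "out_dir",
    selected_adapters=["adapter_b"],           # keep just this adapter
    # adapter_names={"adapter_b": "default"},  # hypothetical renaming argument
)
```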
-
Hi there! First of all, thanks for the great library!
I was wondering if it is possible to parse a multidimensional score. For example, I want to judge my model in 3 categories: 1) positivity, 2) grammar,…
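In case it helps clarify the request, here is a rough sketch of one way a multidimensional score could be represented and parsed, assuming the judge's output can be coerced to JSON; the use of pydantic and the third category name ("relevance") are my own placeholders, not this library's API:
```python
from pydantic import BaseModel, Field

class MultiScore(BaseModel):
    """One field per judged category, each constrained to [0, 1]."""
    positivity: float = Field(ge=0.0, le=1.0)
    grammar: float = Field(ge=0.0, le=1.0)
    relevance: float = Field(ge=0.0, le=1.0)  # placeholder third category

# Example JSON as a judge model might emit it.
raw = '{"positivity": 0.8, "grammar": 0.95, "relevance": 0.6}'
score = MultiScore.model_validate_json(raw)
print(score.positivity, score.grammar, score.relevance)
```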