How do I use it? Are there any code examples?

For training:

Training with RRHF: you can train your own model on generated or released datasets using the script train.sh. Please note that training requires 8 A100 80GB GPUs, bf16, and FSDP. In the future, we will try efficient training methods such as LoRA, Prefix-tuning, or Adapter to lower the computational resource requirements.

bash ./train.sh
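
Since the hardware requirement is strict, a quick pre-flight check before launching train.sh can save a failed run. The sketch below is not part of the RRHF repository: the names `Gpu`, `check_requirements`, and `detect_gpus` are made up for illustration, and the 75 GiB threshold is an assumption (cards sold as 80GB usually report slightly less than 80 GiB). It only assumes PyTorch is installed when `detect_gpus` is called.

```python
from dataclasses import dataclass
from typing import List, Tuple

REQUIRED_GPUS = 8        # from the note above: 8 A100 80GB GPUs
MIN_MEM_GIB = 75.0       # assumed threshold; "80GB" cards report a bit under 80 GiB


@dataclass
class Gpu:
    """Hypothetical record describing one visible GPU."""
    name: str
    total_mem_gib: float


def check_requirements(gpus: List[Gpu], bf16_supported: bool) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(gpus) < REQUIRED_GPUS:
        problems.append(f"found {len(gpus)} GPU(s), need {REQUIRED_GPUS}")
    for index, gpu in enumerate(gpus):
        if gpu.total_mem_gib < MIN_MEM_GIB:
            problems.append(
                f"GPU {index} ({gpu.name}) has {gpu.total_mem_gib:.1f} GiB, "
                f"need at least {MIN_MEM_GIB:.0f} GiB"
            )
    if not bf16_supported:
        problems.append("bf16 is not supported on this device")
    return problems


def detect_gpus() -> Tuple[List[Gpu], bool]:
    """Query PyTorch for the visible GPUs and bf16 support."""
    import torch  # imported lazily so check_requirements works without PyTorch

    if not torch.cuda.is_available():
        return [], False
    gpus = []
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        gpus.append(Gpu(props.name, props.total_memory / 2**30))
    return gpus, torch.cuda.is_bf16_supported()
```

Calling `check_requirements(*detect_gpus())` and printing the returned problems before running `bash ./train.sh` is one way to use it.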

To use Wombat: use recover_wombat_7b.sh and single_sentence_inference.py.
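
For readers who want to see roughly what single-prompt inference looks like in code, here is a minimal sketch using Hugging Face transformers. It is not the contents of single_sentence_inference.py. It assumes that recover_wombat_7b.sh leaves a checkpoint directory loadable by `from_pretrained`, and the function name `generate_reply` and its `model_dir` argument are hypothetical. The prompt is passed through unchanged; any prompt template the model expects should be taken from the repository.

```python
def generate_reply(model_dir: str, prompt: str, max_new_tokens: int = 128) -> str:
    """Load a causal LM from model_dir and return its continuation of prompt."""
    # Heavy dependencies are imported lazily, inside the function.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # model_dir is assumed to be the directory holding the recovered weights.
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir, torch_dtype=torch.bfloat16)
    model.to(device)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

A call would then look like `generate_reply("path/to/recovered/weights", "your prompt here")`, where the path is a placeholder for wherever the recovery script wrote its output.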

GanjinZero / RRHF
