togethercomputer / OpenChatKit

Apache License 2.0
9.01k stars 1.01k forks source link

LoRA script missing #127

Closed zarandioon closed 1 year ago

zarandioon commented 1 year ago

Describe the bug Your blog refers to a LoRA script in the following location, but this does not exist. Can you please look into this?

/training/lora/redpajama-incite-chat-3b.py

orangetin commented 1 year ago

Hey @zarandioon , you can find the LoRa scripts in the lora branch for now. There is a PR open :)

alexanderfrey commented 1 year ago

@orangetin Thanks for the lora training scripts. I have 2 questions

  1. Do you have an example for merging the lora weights with the original model ? From what I understand that is important in order to use chat frameworks like https://github.com/huggingface/text-generation-inference
  2. A quick test for the results after training showed that the model does not properly stop generating text. Do you append <|endoftext|> to every training example ?

Thanks & kind regards Alexander

orangetin commented 1 year ago

Hey @alexanderfrey ,

  1. This example will merge the lora weights to the base model:
    
    import torch
    from peft import PeftModel, PeftConfig
    from transformers import AutoModelForCausalLM

peft_model_path ='PEFT_MODEL_OUTPUT_DIR'

config = PeftConfig.from_pretrained(peft_model_path) model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, return_dict=True, device_map='auto')

Load the Lora model

model = PeftModel.from_pretrained(model, peft_model_path)

model = model.merge_and_unload()

model.save_pretrained('MERGED_MODEL')



2. You should set `<human>` as the stop word for the RedPajama models (use the OIG dataset as an example for formatting your data). You could fine-tune the model with `<|endoftext|>` at the end of every training example, but that's up to you :)

Let me know if you have any other questions.

Closing issue as the LoRA example scripts have been merged into main: https://github.com/togethercomputer/OpenChatKit/tree/3eba68969f137c34cea92d55c0a040f1b16de3db/training/lora/example