Lightning-AI / litgpt

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
https://lightning.ai
Apache License 2.0

Problem when evaluating finetune model using adapter_v2 #1266

Open TonAnh opened 7 months ago

TonAnh commented 7 months ago

Hi, I just noticed that the adapter_v2 finetuning script saves the final model under the name lit_model.pth.adapter_v2:

    # Save the final Adapter checkpoint at the end of training
    save_path = out_dir / "final" / "lit_model.pth.adapter_v2"
    save_path.parent.mkdir(parents=True, exist_ok=True)
    save_adapter_v2_checkpoint(fabric, model, save_path)
    if fabric.global_rank == 0:
        # Copy checkpoint files from original checkpoint dir
        copy_config_files(checkpoint_dir, save_path.parent)
        save_hyperparameters(setup, save_path.parent)
        save_prompt_style(data.prompt_style, save_path.parent)

And this causes a problem when running:

litgpt evaluate \
  --checkpoint_dir out/finetune/adapterv2-stablelm/final \
  --out_dir out/evaluate/adapterv2-stablelm3b \
  --tasks "truthfulqa_mc2,hellaswag,mmlu" \
  --batch_size 4 \
  --seed 42 \
  --save_filepath out/result

This fails because it couldn't find the 'lit_model.pth' file. I renamed lit_model.pth.adapter_v2 to lit_model.pth and tried again, but got this error:

File "litgpt/scripts/convert_lit_checkpoint.py", line 238, in check_conversion_supported
    raise NotImplementedError("Converting adapter models is supported.")
NotImplementedError: Converting adapter models is supported.
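
A quick way to confirm that the renamed file still contains adapter-specific weights (a minimal inspection sketch, not part of litgpt; it assumes the checkpoint is a plain torch state dict, possibly nested under a "model" key):

    # Hypothetical inspection snippet: check whether the renamed file
    # still contains adapter-specific tensors.
    import torch

    checkpoint = torch.load(
        "out/finetune/adapterv2-stablelm/final/lit_model.pth", map_location="cpu"
    )
    state_dict = checkpoint.get("model", checkpoint)  # weights may be nested
    flagged = [k for k in state_dict if "adapter" in k or "gating_factor" in k]
    print(f"{len(flagged)} adapter-specific keys, for example:")
    for key in flagged[:5]:
        print(" ", key)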

Is there anything I did wrong when running the evaluation? For more context, the checkpoint_dir is the out_dir from my finetuning run, and it contains:

generation_config.json
hyperparameters.yaml
lit_model.pth.adapter_v2
model_config.yaml
prompt_style.yaml
tokenizer_config.json
tokenizer.json
TonAnh commented 6 months ago

I just found out that the problem is the check_conversion_supported function in convert_lit_checkpoint.py, which raises a NotImplementedError whenever adapter weights are present (the message presumably should read "is not supported"). Does this mean litgpt doesn't support evaluating models finetuned with adapter_v2 yet?

def check_conversion_supported(lit_weights: Dict[str, torch.Tensor]) -> None:
    if any("lora" in wn for wn in lit_weights):
        raise ValueError("Checkpoints with LoRA weights cannot be converted. Call `scripts/merge_lora.py` first.")
    if any("adapter" in wn or "gating_factor" in wn for wn in lit_weights):
        raise NotImplementedError("Converting adapter models is supported.")
TonAnh commented 6 months ago

Hi @rasbt @carmocca. Can you confirm whether that's the case? I would greatly appreciate it!

rasbt commented 6 months ago

Sorry for the late response, I was traveling last week and haven't fully caught up yet.

The adapter methods are not very popular so we haven't prioritized them lately. Let me look into that.

rasbt commented 6 months ago

Argh, you were right: the problem is the gating factor in the adapter models, which is currently not supported by our conversion tools for the Evaluation Harness. Sorry, this is something we'll have to look into at some point, but given all the other ongoing PRs, there's unfortunately no concrete timeline for it.
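
In case it's useful context: adapter_v2 follows the LLaMA-Adapter approach, where the adapter's attention contribution is scaled by a learned, zero-initialized gate. A rough sketch of the idea (simplified, not litgpt's exact implementation):

    import torch
    import torch.nn as nn

    class GatedAdapterAttention(nn.Module):
        """Simplified LLaMA-Adapter-style gating; not litgpt's actual class."""

        def __init__(self, n_head: int) -> None:
            super().__init__()
            # zero-initialized so the adapter starts out as a no-op
            self.gating_factor = nn.Parameter(torch.zeros(1, n_head, 1, 1))

        def forward(self, base_out: torch.Tensor, adapter_out: torch.Tensor) -> torch.Tensor:
            # this extra parameter has no slot in a standard HF checkpoint,
            # which is what breaks the conversion
            return base_out + self.gating_factor * adapter_out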

TonAnh commented 6 months ago

@rasbt Thank you very much. I think I will try to modify the code for my needs. As I understand it, the conversion lives in litgpt/scripts/convert_lit_checkpoint.py, so I should modify that, right? Is there any other file I need to check?

rasbt commented 6 months ago

@TonAnh You are correct, that should be the main file. It essentially converts the LitGPT checkpoint to a HF model that can be used in the evaluation harness. In Lit-Llama, the original project that focused only on the original Llama model (rather than the many models LitGPT supports), the conversion did work for adapter-finetuned models (https://github.com/Lightning-AI/lit-llama/blob/main/scripts/convert_checkpoint.py). There must have been a specific reason we don't support that here yet; my colleagues who developed the conversion script can perhaps comment more on that.
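
Roughly speaking, the conversion renames LitGPT weight keys into the HF layout, and adapter checkpoints contain extra tensors that have no place in that layout. Purely as an illustrative sketch (this is not the actual conversion code):

    from typing import Dict, Tuple

    import torch

    def split_convertible_keys(
        lit_weights: Dict[str, torch.Tensor]
    ) -> Tuple[Dict[str, torch.Tensor], Dict[str, torch.Tensor]]:
        """Separate base weights (renamable to HF) from adapter-only tensors."""
        convertible, adapter_only = {}, {}
        for name, tensor in lit_weights.items():
            if "adapter" in name or "gating_factor" in name:
                adapter_only[name] = tensor  # no counterpart in the HF layout
            else:
                convertible[name] = tensor
        return convertible, adapter_only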

If you get it to work, a PR would be super appreciated!