TonAnh opened 7 months ago
I just found out the problem is that `convert_lit_checkpoint`, when checking for conversion support, raises a `NotImplementedError`. Does this mean litgpt doesn't currently support evaluating a model finetuned with adapter_v2? This is the relevant check:
```python
from typing import Dict

import torch


def check_conversion_supported(lit_weights: Dict[str, torch.Tensor]) -> None:
    if any("lora" in wn for wn in lit_weights):
        raise ValueError("Checkpoints with LoRA weights cannot be converted. Call `scripts/merge_lora.py` first.")
    if any("adapter" in wn or "gating_factor" in wn for wn in lit_weights):
        raise NotImplementedError("Converting adapter models is not supported.")
```
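To double-check which weight names trip that second condition, I loaded my finetuning output directly (the path below is just my local `out_dir`, and I'm guessing the state dict may be nested under a `"model"` key depending on how it was saved):

```python
import torch

# Path to the adapter_v2 finetuning output; adjust to your own out_dir.
state = torch.load("out_dir/lit_model.pth.adapter_v2", map_location="cpu")
lit_weights = state.get("model", state)  # unwrap if nested under a "model" key

# Same condition as in check_conversion_supported: any name containing
# "adapter" or "gating_factor" makes the conversion raise.
offending = [wn for wn in lit_weights if "adapter" in wn or "gating_factor" in wn]
print(offending)
```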
Hi @rasbt @carmocca, can you confirm whether that's the case? I would greatly appreciate it!
Sorry for the late response; I was traveling last week and haven't fully caught up yet.
The adapter methods are not very popular, so we haven't prioritized them lately. Let me look into that.
Argh, you were right: the problem is the gating factor in the adapter models, which is currently not supported by our conversion tools for the Evaluation Harness. Sorry, this is something we'll have to look into at some point, but given all the other ongoing PRs, there's unfortunately no concrete timeline for that.
@rasbt Thank you very much. I think I will try to modify the code for my needs. As I understand it, the conversion lives in `litgpt/scripts/convert_lit_checkpoint.py`, so that's what I should modify, right? Is there any other file I need to check?
@TonAnh You are correct, this should be the main file. It essentially converts the LitGPT checkpoint into a HF model so it can be used with the Evaluation Harness. In Lit-LLaMA, the original project that focused solely on the original Llama model (rather than the many models LitGPT supports), the conversion did work for adapter-finetuned models (https://github.com/Lightning-AI/lit-llama/blob/main/scripts/convert_checkpoint.py). There must have been a specific reason why we don't support that here yet, but my colleagues who developed the conversion script can perhaps comment more on that.
If you get it to work, a PR would be super appreciated!
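If someone wants to attempt it, a possible first step is combining the adapter checkpoint with the base weights into a single state dict before any conversion. Here is a minimal, untested sketch; the paths are placeholders, and it assumes the `.adapter_v2` file contains only the adapter parameters:

```python
import torch

# Placeholder paths; adjust to your checkpoint layout.
base_path = "checkpoints/base-model/lit_model.pth"  # pretrained base weights
adapter_path = "out_dir/lit_model.pth.adapter_v2"   # adapter_v2 finetuning output

base = torch.load(base_path, map_location="cpu")
adapter_state = torch.load(adapter_path, map_location="cpu")
adapter = adapter_state.get("model", adapter_state)  # unwrap if nested under "model"

# Overlay: adapter entries replace matching base entries and add the new
# adapter-specific tensors (e.g. the gating factors).
merged = {**base, **adapter}
torch.save(merged, "out_dir/lit_model.pth")
```

Keep in mind this alone won't make the HF conversion pass: tensors like `gating_factor` have no counterpart in the HF architecture, which is exactly why `check_conversion_supported` rejects them. A full fix would also need to map or strip those weights in `convert_lit_checkpoint.py`.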
Hi, I just noticed that finetuning with adapter_v2 saves the final model under the name `lit_model.pth.adapter_v2`, and this causes a problem when running the evaluation because it can't find a `lit_model.pth` file.

I renamed `lit_model.pth.adapter_v2` to `lit_model.pth` and tried again, but got this error. Is there anything I did wrong when running the evaluation? For more context, the `checkpoint_dir` is the `out_dir` from my finetuning run, and it includes: