SFT via LoRA.
Should I download https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/blob/main/config.json into the saved model path?
I also tested the code from https://docs.vllm.ai/en/latest/models/lora.html:
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest
llm = LLM(model=LOCAL_PATH, enable_lora=True)
where LOCAL_PATH stores:
adapter_config.json
adapter_model.safetensors
README.md
special_tokens_map.json
tokenizer_config.json
tokenizer.json
tokenizer.model
trainer_state.json
training_args.bin
Also, model = AutoModelForCausalLM.from_pretrained(LOCAL_PATH) works well.
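(That presumably works because transformers' PEFT integration notices adapter_config.json and pulls the base model itself. A minimal sketch of the equivalent explicit loading; the base model name is assumed to match base_model_name_or_path in adapter_config.json:)
from transformers import AutoModelForCausalLM
from peft import PeftModel
# Load the base checkpoint, then attach the LoRA adapter saved in LOCAL_PATH
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
model = PeftModel.from_pretrained(base, LOCAL_PATH)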
So did this snippet work? -->
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest
llm = LLM(model=LOCAL_PATH, enable_lora=True)
Maybe pass the enable_lora=True kwarg to the langchain alternative. Another alternative is to merge the LoRA weights to get a "base" model back, copy the base config, and reload it in langchain. But tbh, this is more of a langchain/vllm issue than a transformers library issue.
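(A rough sketch of the merge route, assuming the base model is mistralai/Mistral-7B-Instruct-v0.2 as in the adapter config:)
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
# Fold the LoRA weights into the base weights, then save a full standalone model
merged = PeftModel.from_pretrained(base, LOCAL_PATH).merge_and_unload()
merged.save_pretrained("mistral-7b-sft-merged")  # writes a complete model dir including config.json
AutoTokenizer.from_pretrained(LOCAL_PATH).save_pretrained("mistral-7b-sft-merged")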
Passing enable_lora=True does not work either. Yes, it seems to be a vLLM issue.
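(For reference, the pattern in the vLLM LoRA docs linked above passes the base model to LLM(...) and supplies the adapter path per request. A sketch only; the base model id and the adapter name here are assumptions:)
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest
# model= must point at the base model, not at the adapter directory
llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2", enable_lora=True)
outputs = llm.generate(
    ["Hello, how are you?"],
    SamplingParams(max_tokens=64),
    lora_request=LoRARequest("sft_adapter", 1, LOCAL_PATH),  # name, int id, local adapter path
)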
System Info
vllm version: 0.4.1
Reproduction
I fine-tuned the mistral-7b-v0.2 model using the Hugging Face TRL trainer (https://huggingface.co/docs/trl/v0.8.6/trainer). The training worked well and it finally saved the model, as below:
adapter_config.json
adapter_model.safetensors
checkpoint-16
checkpoint-24
checkpoint-8
README.md
special_tokens_map.json
tokenizer_config.json
tokenizer.json
tokenizer.model
training_args.bin
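(A rough sketch of that kind of SFT-via-LoRA run with TRL 0.8.x; the dataset, hyperparameters, and output directory are placeholders, not taken from the issue:)
from datasets import load_dataset
from transformers import TrainingArguments
from peft import LoraConfig
from trl import SFTTrainer
dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")  # placeholder dataset
trainer = SFTTrainer(
    model="mistralai/Mistral-7B-Instruct-v0.2",
    args=TrainingArguments(output_dir=LOCAL_PATH, num_train_epochs=1),
    train_dataset=dataset,
    dataset_text_field="text",
    peft_config=LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"),
)
trainer.train()
trainer.save_model(LOCAL_PATH)  # saves adapter_config.json + adapter_model.safetensors, not a full model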
However, when I try to load it back via vLLM, it raises an error, whereas loading it via AutoModelForCausalLM.from_pretrained works fine. Any advice?
Expected behavior
vLLM should load the fine-tuned model back.