ztfmars opened this issue 4 months ago
@ztfmars Hi
This issue is caused by a mismatch between the versions of transformers and peft.
This PR https://github.com/huggingface/peft/pull/1368/files adds `layer_replication` support to `LoraConfig`, so we recommend updating your peft to v0.10.0 and re-running your merge script.
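For reference, after upgrading (e.g. `pip install -U "peft>=0.10.0"`), a minimal check, nothing xtuner-specific, can confirm the environment actually picked up a new enough peft before re-running the merge script:

```python
# Minimal sanity check that the installed peft is new enough for
# `layer_replication` in LoraConfig (added in peft v0.10.0).
from importlib.metadata import version
from packaging.version import Version

peft_version = version("peft")
if Version(peft_version) < Version("0.10.0"):
    raise RuntimeError(
        f"peft {peft_version} predates layer_replication support; "
        "upgrade to >= 0.10.0 before re-running the merge script.")
print(f"peft {peft_version} is OK")
```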
Yes, it works! But there are obvious version conflicts between xtuner and lmdeploy over peft, so I will set up a separate venv for lmdeploy and continue.
Thanks very much!
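Before splitting environments, it can help to see the conflict explicitly. A small sketch (assuming both xtuner and lmdeploy are installed in the current environment) that prints each package's declared peft requirement:

```python
# Print the peft requirement declared by each package, to make the
# xtuner / lmdeploy version conflict visible.
from importlib.metadata import requires

for pkg in ("xtuner", "lmdeploy"):
    reqs = requires(pkg) or []
    peft_reqs = [r for r in reqs if r.lower().startswith("peft")]
    print(f"{pkg}: {peft_reqs or 'no explicit peft requirement'}")
```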
I used
llava_llama3_8b_instruct_qlora_clip_vit_large_p14_336_lora_e1_finetune.py
to fine-tune on my dataset, aiming to get a LLaVA-Llama3-8B multimodal model trained on my data. After training and converting pth -> hf, I got the LLM adapter, the visual encoder adapter, and the projector.
But I can't merge the LLM and the LLM adapter together, so I can't get the merged LLM weights as in the tutorial https://github.com/InternLM/xtuner/tree/main/xtuner/configs/llava/llama3_8b_instruct_clip_vit_large_p14_336
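For reference, the merge step in the xtuner docs uses `xtuner convert merge ${LLM} ${LLM_ADAPTER} ${SAVE_PATH}`. The same LLM-side merge can also be done directly with peft's `merge_and_unload`; a sketch, where `adapter_path` is a hypothetical placeholder for the llm_adapter directory produced by `xtuner convert pth_to_hf` (adjust paths to your own layout):

```python
# Sketch: merge the LLM LoRA adapter back into the base Llama-3 weights with peft.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_path = '/home/fusionai/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct'
adapter_path = './llava_hf/llm_adapter'  # hypothetical pth_to_hf output dir
save_path = './merged_llm'

base = AutoModelForCausalLM.from_pretrained(
    base_path, torch_dtype=torch.float16, trust_remote_code=True)
model = PeftModel.from_pretrained(base, adapter_path)
merged = model.merge_and_unload()  # folds the LoRA weights into the base model
merged.save_pretrained(save_path)
AutoTokenizer.from_pretrained(base_path).save_pretrained(save_path)
```

Note this only covers the LLM merge; the visual encoder adapter and projector still follow the tutorial's remaining steps.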
The error is as follows:

Additional description:

Merge command:

Training config:
```python
import torch
from mmengine.hooks import (CheckpointHook, DistSamplerSeedHook, IterTimerHook,
                            LoggerHook, ParamSchedulerHook)
from mmengine.optim import AmpOptimWrapper, CosineAnnealingLR, LinearLR
from peft import LoraConfig
from torch.optim import AdamW
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, CLIPImageProcessor,
                          CLIPVisionModel)

from xtuner.dataset import LLaVADataset
from xtuner.dataset.collate_fns import default_collate_fn
from xtuner.dataset.map_fns import llava_map_fn, template_map_fn_factory
from xtuner.dataset.samplers import LengthGroupedSampler
from xtuner.engine.hooks import DatasetInfoHook, EvaluateChatHook
from xtuner.engine.runner import TrainLoop
from xtuner.model import LLaVAModel
from xtuner.utils import PROMPT_TEMPLATE
#######################################################################
#                          PART 1  Settings                           #
#######################################################################
# Model
llm_name_or_path = '/home/fusionai/.cache/modelscope/hub/LLM-Research/Meta-Llama-3-8B-Instruct'  # noqa: E501
visual_encoder_name_or_path = '/home/fusionai/.cache/modelscope/hub/AI-ModelScope/clip-vit-large-patch14-336'  # noqa: E501
# Specify the pretrained pth
pretrained_pth = '/home/fusionai/project/internllm_demo/llama3/pretrained-model/llama3-llava-iter_2181.pth'  # noqa: E501

# Data
data_root = '/home/fusionai/project/datasets/llama3_test001/'
data_path = data_root + 'repeated_data.json'
image_folder = data_root
prompt_template = PROMPT_TEMPLATE.llama3_chat
max_length = int(2048 - (336 / 14)**2)

# Scheduler & Optimizer
batch_size = 1  # per_device
accumulative_counts = 1
dataloader_num_workers = 0
max_epochs = 1
optim_type = AdamW
lr = 2e-4
betas = (0.9, 0.999)
weight_decay = 0
max_norm = 1  # grad clip
warmup_ratio = 0.03

# Save
save_steps = 500
save_total_limit = 2  # Maximum checkpoints to keep (-1 means unlimited)

# Evaluate the generation performance during the training
evaluation_freq = 500
SYSTEM = ''
evaluation_images = '/home/fusionai/project/datasets/llama3_test001/imgs/test0001.png'
# i.e. 'What logic does this diagram represent?', 'What logic symbols appear in the diagram?'
evaluation_inputs = ['此图表示什么逻辑?', '图中都有哪些逻辑符号?']
#######################################################################
#            PART 2  Model & Tokenizer & Image Processor              #
#######################################################################
tokenizer = dict(
    type=AutoTokenizer.from_pretrained,
    pretrained_model_name_or_path=llm_name_or_path,
    trust_remote_code=True,
    padding_side='right')

image_processor = dict(
    type=CLIPImageProcessor.from_pretrained,
    pretrained_model_name_or_path=visual_encoder_name_or_path,
    trust_remote_code=True)

model = dict(
    type=LLaVAModel,
    freeze_llm=True,
    freeze_visual_encoder=True,
    pretrained_pth=pretrained_pth,
    llm=dict(
        type=AutoModelForCausalLM.from_pretrained,
        pretrained_model_name_or_path=llm_name_or_path,
        trust_remote_code=True,
        torch_dtype=torch.float16,
        quantization_config=dict(
            type=BitsAndBytesConfig,
            load_in_4bit=True,
            load_in_8bit=False,
            llm_int8_threshold=6.0,
            llm_int8_has_fp16_weight=False,
            bnb_4bit_compute_dtype=torch.float16,
            bnb_4bit_use_double_quant=True,
            bnb_4bit_quant_type='nf4')),
    llm_lora=dict(
        type=LoraConfig,
        r=512,
        lora_alpha=256,
        lora_dropout=0.05,
        bias='none',
        task_type='CAUSAL_LM'),
    visual_encoder=dict(
        type=CLIPVisionModel.from_pretrained,
        pretrained_model_name_or_path=visual_encoder_name_or_path),
    visual_encoder_lora=dict(
        type=LoraConfig, r=64, lora_alpha=16, lora_dropout=0.05, bias='none'))

#######################################################################
#                     PART 3  Dataset & Dataloader                    #
#######################################################################
llava_dataset = dict(
    type=LLaVADataset,
    data_path=data_path,
    image_folder=image_folder,
    tokenizer=tokenizer,
    image_processor=image_processor,
    dataset_map_fn=llava_map_fn,
    template_map_fn=dict(
        type=template_map_fn_factory, template=prompt_template),
    max_length=max_length,
    pad_image_to_square=True)

train_dataloader = dict(
    batch_size=batch_size,
    num_workers=dataloader_num_workers,
    dataset=llava_dataset,
    sampler=dict(
        type=LengthGroupedSampler,
        length_property='modality_length',
        per_device_batch_size=batch_size * accumulative_counts),
    collate_fn=dict(type=default_collate_fn))
#######################################################################
#                    PART 4  Scheduler & Optimizer                    #
#######################################################################
# optimizer
optim_wrapper = dict(
    type=AmpOptimWrapper,
    optimizer=dict(
        type=optim_type, lr=lr, betas=betas, weight_decay=weight_decay),
    clip_grad=dict(max_norm=max_norm, error_if_nonfinite=False),
    accumulative_counts=accumulative_counts,
    loss_scale='dynamic',
    dtype='float16')

# learning policy
# More information: https://github.com/open-mmlab/mmengine/blob/main/docs/en/tutorials/param_scheduler.md  # noqa: E501
param_scheduler = [
    dict(
        type=LinearLR,
        start_factor=1e-5,
        by_epoch=True,
        begin=0,
        end=warmup_ratio * max_epochs,
        convert_to_iter_based=True),
    dict(
        type=CosineAnnealingLR,
        eta_min=0.0,
        by_epoch=True,
        begin=warmup_ratio * max_epochs,
        end=max_epochs,
        convert_to_iter_based=True)
]

# train, val, test setting
train_cfg = dict(type=TrainLoop, max_epochs=max_epochs)
#######################################################################
#                           PART 5  Runtime                           #
#######################################################################
# Log the dialogue periodically during the training process, optional
custom_hooks = [
    dict(type=DatasetInfoHook, tokenizer=tokenizer),
    dict(
        type=EvaluateChatHook,
        tokenizer=tokenizer,
        image_processor=image_processor,
        every_n_iters=evaluation_freq,
        evaluation_inputs=evaluation_inputs,
        evaluation_images=evaluation_images,
        system=SYSTEM,
        prompt_template=prompt_template)
]

# configure default hooks
default_hooks = dict(
    # record the time of every iteration.
    timer=dict(type=IterTimerHook),
    # print log every 10 iterations.
    logger=dict(type=LoggerHook, log_metric_by_epoch=False, interval=10),
    # enable the parameter scheduler.
    param_scheduler=dict(type=ParamSchedulerHook),
    # save checkpoint per `save_steps`.
    checkpoint=dict(
        type=CheckpointHook,
        by_epoch=False,
        interval=save_steps,
        max_keep_ckpts=save_total_limit),
    # set sampler seed in distributed environment.
    sampler_seed=dict(type=DistSamplerSeedHook),
)

# configure environment
env_cfg = dict(
    # whether to enable cudnn benchmark
    cudnn_benchmark=False,
    # set multi process parameters
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),
    # set distributed parameters
    dist_cfg=dict(backend='nccl'),
)

# set visualizer
visualizer = None

# set log level
log_level = 'INFO'

# load from which checkpoint
load_from = None

# whether to resume training from the loaded checkpoint
resume = False

# Defaults to use random seed and disable `deterministic`
randomness = dict(seed=None, deterministic=False)

# set log processor
log_processor = dict(by_epoch=False)
```