Error message:
Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 92, in serve
    server.serve(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 246, in serve
    asyncio.run(
  File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 205, in serve_inner
    model = get_model(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init__.py", line 622, in get_model
    return FlashQwen2(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_qwen2.py", line 72, in __init__
    model = Qwen2ForCausalLM(config, weights)
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/custom_modeling/flash_qwen2_modeling.py", line 351, in __init__
    self.lm_head = SpeculativeHead.load(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/layers.py", line 615, in load
    lm_head = TensorParallelHead.load(config, prefix, weights)
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/layers.py", line 654, in load
    weight = weights.get_tensor(f"{prefix}.weight")
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/weights.py", line 99, in get_tensor
    filename, tensor_name = self.get_filename(tensor_name)
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/weights.py", line 63, in get_filename
    raise RuntimeError(f"weight {tensor_name} does not exist")
RuntimeError: weight lm_head.weight does not exist
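
A likely (but unconfirmed) cause: the small Qwen2 checkpoints, including qwen2-1.5b, ship with tie_word_embeddings: true, so the saved weights contain no separate lm_head.weight tensor and TGI's SpeculativeHead.load cannot find it. A minimal sketch to verify this against the checkpoint, assuming the weights sit in the output_dir from the config below:

import glob
from safetensors import safe_open

# Collect all tensor names across the checkpoint shards.
names = set()
for path in sorted(glob.glob("saves/zk/dpo/*.safetensors")):
    with safe_open(path, framework="pt") as f:
        names.update(f.keys())

print("lm_head.weight present:", "lm_head.weight" in names)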
Fine-tuning parameters:
### model
model_name_or_path: qwen2-1.5b

### method
stage: dpo
do_train: true
finetuning_type: full
pref_beta: 0.1
pref_loss: simpo  # choices: [sigmoid (dpo), orpo, simpo]
pref_ftx: 0.5
simpo_gamma: 0.6
dpo_label_smoothing: 0.1

### dataset
dataset: zk_dpo
template: empty
cutoff_len: 1024
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/zk/dpo
logging_steps: 10
save_steps: 5000
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 8
gradient_accumulation_steps: 2
learning_rate: 5.0e-6
num_train_epochs: 1.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
report_to: wandb
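
If the check above confirms that lm_head.weight is missing, one common workaround is to re-save the fine-tuned model with the head materialized as its own tensor. This is only a sketch under the assumption that tied embeddings are the cause; dst is a hypothetical destination path:

# Reload the fine-tuned model, give lm_head its own copy of the
# embedding matrix, mark the weights as untied, and re-save.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

src = "saves/zk/dpo"         # output_dir from the config above
dst = "saves/zk/dpo-untied"  # hypothetical destination path

model = AutoModelForCausalLM.from_pretrained(src, torch_dtype=torch.bfloat16)
model.lm_head.weight = torch.nn.Parameter(
    model.get_input_embeddings().weight.detach().clone()
)
model.config.tie_word_embeddings = False
model.save_pretrained(dst, safe_serialization=True)
AutoTokenizer.from_pretrained(src).save_pretrained(dst)

Upgrading text-generation-inference may also be enough on its own, since later releases can fall back to the embedding matrix when the head is tied; that is worth checking before rewriting the checkpoint.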