OpenBMB / ToolBench

[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
https://openbmb.github.io/ToolBench/
Apache License 2.0

Strange inference outputs #288

Open Zhang-Henry opened 3 months ago

Zhang-Henry commented 3 months ago

Hi, thanks for open-sourcing this! I am using the released weights, but inference produces very strange output. For example:

  1. scripts/inference_toolllama_lora_pipeline_open_domain.sh:
python -u toolbench/inference/qa_pipeline_open_domain.py \
    --tool_root_dir data_0801/toolenv/tools/ \
    --corpus_tsv_path data_0801/retrieval/G1/corpus.tsv \
    --retrieval_model_path data_0801/retriever_model_clean/2024-06-25_07-44-12 \
    --retrieved_api_nums 5 \
    --backbone_model toolllama \
    --model_path huggyllama/llama-7b \
    --lora \
    --lora_path /data/local2/hz624/ToolLLaMA-7b-LoRA-v1 \
    --max_observation_length 1024 \
    --observ_compress_method truncate \
    --method DFS_woFilter_w2 \
    --input_query_file data_0801/instruction/G1_query.json \
    --output_answer_file $OUTPUT_DIR \
    --toolbench_key $TOOLBENCH_KE
/data/local2/hz624/ToolLLaMA-7b-LoRA-v1$ du -sh *
4.0K    adapter_config.json
8.1M    adapter_model.bin
4.0K    README.md
232K    trainer_state.json

Training log: toolllama_lora_open_domain_clean_0801.log

  2. scripts/inference_toolllama_pipeline.sh:
    python -u toolbench/inference/qa_pipeline.py \
    --tool_root_dir /data/local2/hz624/data_new/toolenv/tools/ \
    --backbone_model toolllama \
    --model_path /data/local2/hz624/ToolLLaMA-2-7b-v2 \
    --max_observation_length 1024 \
    --observ_compress_method truncate \
    --method DFS_woFilter_w2 \
    --input_query_file /data/local2/hz624/data_new/instruction/G1_query.json \
    --output_answer_file $OUTPUT_DIR \
    --toolbench_key
/data/local2/hz624/ToolLLaMA-2-7b-v2$ du -sh *
4.0K    config.json
4.0K    generation_config.json
9.2G    pytorch_model-00001-of-00003.bin
9.3G    pytorch_model-00002-of-00003.bin
6.7G    pytorch_model-00003-of-00003.bin
28K     pytorch_model.bin.index.json
4.0K    README.md
4.0K    special_tokens_map.json
4.0K    tokenizer_config.json
492K    tokenizer.model
696K    trainer_state.json

Training log: toolllama-2-7b-v2_dfs_pipeline.log

From line 67 of that log onward, the output is completely garbled (a minimal standalone check of these weights is sketched below).

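For reference, a standalone check like the following (just a sketch, assuming plain transformers/accelerate and the local path above; it is not part of the ToolBench pipeline) would at least tell whether the ToolLLaMA-2-7b-v2 weights load and decode to readable text outside qa_pipeline.py:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Path matches the checkpoint directory listed above.
model_path = "/data/local2/hz624/ToolLLaMA-2-7b-v2"
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

# Any short prompt works here; the goal is only to see whether the checkpoint
# produces readable text, not to reproduce the pipeline's prompt format.
inputs = tokenizer("Question: What is the capital of France?\nAnswer:", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
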
I saw that #169 mentions:

  1. The model was not loaded correctly, so inference was actually running on plain llama-2; check whether the model is loaded properly (a quick adapter-loading check is sketched after this list).
  2. The prompt fed to the model differs from the prompt used to train toolllama-2; check whether the prompt you use matches the latest version in the repo.

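For point 1, something like the following (a sketch using peft/transformers directly, with the base model and adapter paths from my command above, rather than the ToolLLaMALoRA code path) should confirm whether the adapter actually attaches to the base model:

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Base model and adapter paths are the ones from my command above.
base = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b", torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "/data/local2/hz624/ToolLLaMA-7b-LoRA-v1")

# If the adapter really loaded, LoRA weight tensors should show up among the
# parameter names; otherwise inference would effectively run on plain llama.
lora_params = [n for n, _ in model.named_parameters() if "lora_" in n]
print(f"{len(lora_params)} LoRA parameter tensors found")
assert lora_params, "no LoRA parameters loaded"
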
For inference_toolllama_lora_pipeline_open_domain I am using the 0801 version of the data, and for inference_toolllama_pipeline I am using the data under data/. Here is the model-loading part:

def get_backbone_model(self):
    args = self.args
    if args.backbone_model == "toolllama":
        # ratio = 4 means the sequence length is expanded by 4, remember to change the model_max_length to 8192 (2048 * ratio) for ratio = 4
        ratio = int(args.max_sequence_length / args.max_source_sequence_length)
        replace_llama_with_condense(ratio=ratio)
        if args.lora:
            backbone_model = ToolLLaMALoRA(base_name_or_path=args.model_path, model_name_or_path=args.lora_path, max_sequence_length=args.max_sequence_length)
        else:
            backbone_model = ToolLLaMA(model_name_or_path=args.model_path, max_sequence_length=args.max_sequence_length)
    else:
        backbone_model = args.backbone_model
    return backbone_model

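Regarding the comment about the condense ratio above: since ratio is computed by integer division of the two length arguments, a quick consistency check like this (a sketch; 8192 and 2048 are the values from that comment, assumed to be what gets passed on the command line) rules out a silently wrong expansion factor:

# Values from the comment above (2048 * ratio = 8192 for ratio = 4); these are
# assumptions about what is passed on the command line, not pipeline defaults.
max_sequence_length = 8192
max_source_sequence_length = 2048

ratio = int(max_sequence_length / max_source_sequence_length)
assert max_sequence_length == max_source_sequence_length * ratio, \
    "lengths are not an exact multiple -- the condense ratio would be silently truncated"
print(f"condense ratio = {ratio}")  # expect 4; ratio = 1 would mean no context expansion
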
I cannot figure out where the problem is, and I would really appreciate any help resolving it. Thanks very much!