System Info
GPU: L20
TensorRT-LLM: v0.11.0
transformers: 4.42.0
Who can help?
@ncomly-nvidia @kaiyux

prompt: '你好,请介绍一下喜马拉雅山的详细信息' (roughly: "Hello, please give a detailed introduction to the Himalayas")
1、transformers

Generation parameters:

```python
generation_config = GenerationConfig(
    top_k=1,
    temperature=1,
    max_length=2048,
    max_new_tokens=80,
    repetition_penalty=1.0,
    early_stopping=True,
    do_sample=True,
    num_beams=1,
    top_p=1,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
)
```

transformers result:

```
喜马拉雅山(Himalayas)是地球上最高的山脉,位于亚洲南部,横跨中国、印度、尼泊尔、不丹、巴基斯坦和阿富汗等国家。以下是关于喜马拉雅山的一些详细信息:

地理位置与范围
喜马拉雅山脉从中国西藏的喜马拉雅山脉开始,向南延伸至印度的喜马拉雅山脉,, 128
```
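For context, the surrounding transformers call looks roughly like this (a minimal sketch; the exact loading code is not shown above, and float16 loading on a single GPU is my assumption):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "/mnt/qwen2/Qwen2-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, torch_dtype=torch.float16, device_map="cuda"
)

# input_ids is built as in step 3 below; generation_config is the one shown above.
output_ids = model.generate(input_ids.to(model.device), generation_config=generation_config)
print(tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True))
```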
2、TensorRT-LLM

Generation parameters:

```python
batch_input_ids=input_ids,
max_new_tokens=80,
end_id=tokenizer.eos_token_id,
pad_id=tokenizer.pad_token_id,
top_k=1
```

TensorRT-LLM result:

```
你好!喜马拉雅山(Himalayas)是地球上最壮观的山脉之一,位于亚洲南部,横跨中国、印度、尼泊尔、不丹、巴基斯坦和阿富汗等国家。以下是关于喜马拉雅山的一些详细信息:地理位置与范围
喜马拉雅山脉从中国西藏的喜马拉雅山脉开始,向南延伸至印度的
```
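The engine is driven roughly as follows (a minimal sketch modeled on examples/run.py and tensorrt_llm.runtime.ModelRunner in v0.11.0; the exact runner code is not shown above):

```python
from tensorrt_llm.runtime import ModelRunner

# Load the engine built in step 4 below (path is the --output_dir used there).
runner = ModelRunner.from_dir(engine_dir="./fp16")

outputs = runner.generate(
    batch_input_ids=list(input_ids),   # examples/run.py passes a list of 1D id tensors
    max_new_tokens=80,
    end_id=tokenizer.eos_token_id,
    pad_id=tokenizer.pad_token_id,
    top_k=1,
    return_dict=True,
)
# output_ids has shape [batch, beams, seq_len]; strip the prompt tokens before decoding.
output_ids = outputs["output_ids"][0, 0, input_ids.shape[1]:]
print(tokenizer.decode(output_ids, skip_special_tokens=True))
```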
3、How input_ids is created

```python
prompt = '你好,请介绍一下喜马拉雅山的详细信息'
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
input_ids = tokenizer(prompt, truncation=True, return_tensors="pt", add_special_tokens=False)['input_ids']
```
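To rule out a tokenization difference, the exact IDs can be inspected before they are fed to either backend (a quick sanity-check sketch):

```python
# Sanity check: both backends should receive exactly these token IDs.
print(input_ids.shape)                  # [1, prompt_len]
print(input_ids[0].tolist())            # raw IDs from apply_chat_template + tokenizer
print(tokenizer.decode(input_ids[0]))   # should reproduce the chat-templated prompt verbatim
```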
4、Build the Qwen2-7B engine

```bash
python convert_checkpoint.py --model_dir /mnt/qwen2/Qwen2-7B-Instruct \
    --output_dir checkpoint \
    --dtype float16

trtllm-build --checkpoint_dir ./checkpoint \
    --output_dir ./fp16 \
    --gemm_plugin float16
```
Reproduction

See steps 1–4 above for the prompt, generation parameters, input_ids construction, and engine build commands.
Expected behavior
1、I expect the Qwen2 output from TensorRT-LLM to align exactly with the transformers output.
actual behavior
1、The two results above differ: the TensorRT-LLM output prepends "你好!" and then diverges ("最壮观的山脉之一" instead of "最高的山脉").
2、Tested many cases; roughly 5-10% are not fully aligned.
additional notes
None.