Closed: Duyi-Wang closed this issue 8 months ago
Please use qwen_convert.py to re-convert the Qwen model. A new parameter, "seq_length", was added for logN and NTK rotary.
I will also submit a PR that checks the parameter and raises a warning if "seq_length" is unset.
Fixed by adding seq_length to config.ini.
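For anyone hitting the same thing, here is a rough sketch of the kind of check described above: read config.ini and warn if "seq_length" is missing. This is not the actual xFasterTransformer code. The function name `check_seq_length` and the `[qwen]` section name are assumptions made for illustration, so adjust them to match the config.ini that qwen_convert.py actually writes.

```python
import configparser
import warnings


def check_seq_length(config_path, section="qwen"):
    """Return seq_length from config.ini, or None (with a warning) if it is unset.

    NOTE: the section name "qwen" is an assumption for this sketch, not a
    confirmed detail of the converted model's config.ini layout.
    """
    config = configparser.ConfigParser()
    config.read(config_path)

    # has_option() also returns False when the section itself is missing.
    if not config.has_option(section, "seq_length"):
        warnings.warn(
            f'"seq_length" is not set in {config_path}; '
            "re-convert the model with qwen_convert.py."
        )
        return None

    return config.getint(section, "seq_length")
```

Calling `check_seq_length("/path/to/model/config.ini")` on a model converted before the change would emit the warning and return None, which is a cheap way to confirm a re-conversion is needed before running the benchmark.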
This issue started occurring after PR https://github.com/intel/xFasterTransformer/pull/215 was merged. Command to reproduce:
bash ./run_benchmark.sh -m qwen-7b -d bf16 -s 1 -bs 1 -in 4096 -out 2048 -i 1