Open yao-matrix opened 2 weeks ago
- A100: 20.21
- Gaudi 2: 36.79
python ./examples/text-generation/run_generation.py --model_name_or_path deepseek-ai/DeepSeek-V2-Lite --use_kv_cache --max_new_tokens 100 --batch_size 1 --bf16 --use_hpu_graphs --prompt "DeepSpeed is a machine learning framework"
- 2x: 57.96
- 4x: 84.14
- 8x: 109.76
python ./examples/gaudi_spawn.py --world_size=8 ./examples/text-generation/run_generation.py --model_name_or_path deepseek-ai/DeepSeek-V2-Lite --use_kv_cache --max_new_tokens 100 --batch_size 1 --bf16 --use_hpu_graphs --parallel_strategy "ep" --prompt "DeepSpeed is a machine learning framework"
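For reviewers who want to see how the multi-card figures scale, here is a rough sketch that computes speedup and per-card efficiency relative to the single-card Gaudi 2 result. It assumes the reported figures are throughput numbers where higher is better (the unit is not stated above) and that `2x`/`4x`/`8x` are card counts for the `--parallel_strategy "ep"` runs. The names `single_card`, `multi_card`, and `scaling` are illustrative only and are not part of the PR.

```python
# Figures copied from the results above; the unit is not stated in the source.
# Assumption: these are throughput numbers, so higher is better.
single_card = 36.79                            # Gaudi 2, 1 card
multi_card = {2: 57.96, 4: 84.14, 8: 109.76}   # card count -> reported figure


def scaling(baseline, results):
    """Return {cards: (speedup, efficiency)} relative to the 1-card baseline."""
    out = {}
    for cards, value in results.items():
        speedup = value / baseline             # how much faster than 1 card
        out[cards] = (speedup, speedup / cards)  # efficiency = speedup per card
    return out


for cards, (speedup, eff) in scaling(single_card, multi_card).items():
    print(f"{cards}x: speedup {speedup:.2f}, efficiency {eff:.0%}")
# 2x: speedup 1.58, efficiency 79%
# 4x: speedup 2.29, efficiency 57%
# 8x: speedup 2.98, efficiency 37%
```

Under those assumptions the 8-card run is about 3x the single-card figure, which is plausible for `--batch_size 1` where there is little work to spread across experts.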
@libinta @sywangyi, pls help review, thx.
@libinta, pls help review, thx.
Enable DeepSeek-V2, including: