intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc
Apache License 2.0

How to profile an IPEX-LLM application to tune performance from top to bottom #11485

Open lucshi opened 4 months ago

lucshi commented 4 months ago

I'd like to use VTune to profile an IPEX-LLM application with a focus on the GPU, e.g. the performance/all-in-one benchmark, to get a full picture of the bottlenecks. My questions are:

  1. Is there a general guide to using VTune to profile an IPEX-LLM application?
  2. Which OS should I choose to get the most profiling detail? Does Windows have limitations, such as layer-level profiling not being available?
  3. How can I map the profiling bottlenecks back to the source code?

leonardozcm commented 4 months ago

Hi @lucshi,

  1. Is there a general guide to using VTune to profile an IPEX-LLM application?

You can refer to the VTune section of the oneAPI GPU optimization guide, https://www.intel.com/content/www/us/en/docs/oneapi/optimization-guide-gpu/2023-0/gpu-analysis-with-vtunetm-profiler.html, and to the GPU recipe in the VTune Profiler cookbook, https://www.intel.com/content/www/us/en/docs/vtune-profiler/cookbook/2024-2/profiling-dpc-application.html
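As a rough starting point, the command-line collection described in those guides can be wrapped from Python. This is only a sketch: `build_vtune_command` is a hypothetical helper, `run.py` is a placeholder for whatever script you launch, and the result directory name is arbitrary. It assumes the `vtune` CLI is on your PATH (for example after sourcing the oneAPI environment) and uses the `gpu-hotspots` analysis type; `gpu-offload` is another analysis type you may want to try first.

```python
import shlex
import sys


def build_vtune_command(script, result_dir="vtune_gpu_hotspots", analysis="gpu-hotspots"):
    """Build the argv list for a VTune command-line collection of a Python script.

    `script` is a placeholder for your own entry point; nothing here is
    specific to IPEX-LLM.
    """
    return [
        "vtune",
        "-collect", analysis,        # e.g. "gpu-hotspots" or "gpu-offload"
        "-result-dir", result_dir,   # where VTune stores the collected data
        "--",                        # everything after this is the profiled command
        sys.executable, script,
    ]


cmd = build_vtune_command("run.py")
# Print the command so it can be checked or pasted into a shell.
# To launch it from Python instead, pass `cmd` to subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

The result directory can then be opened in the VTune GUI to inspect the GPU timeline and kernels.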

  2. Which OS should I choose to get the most profiling detail? Does Windows have limitations, such as layer-level profiling not being available?

Both Linux and Windows are fine. If you want layer-level profiling, how about using the PyTorch profiler (torch.profiler)?
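A minimal sketch of what layer-level profiling with `torch.profiler` could look like is shown here. The tiny `nn.Sequential` model is a stand-in, not an IPEX-LLM model, and the `layer_<name>` labels are made up for illustration. The XPU branch is an assumption: it is only taken if your PyTorch build exposes `ProfilerActivity.XPU` and an XPU device is available; otherwise the sketch profiles on the CPU.

```python
import torch
from torch import nn
from torch.profiler import ProfilerActivity, profile, record_function

# Stand-in model and input; replace with your own model and prompt tensors.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 8))
inputs = torch.randn(4, 64)

activities = [ProfilerActivity.CPU]
# Assumption: XPU tracing is only present in some PyTorch builds, so check first.
if hasattr(ProfilerActivity, "XPU") and hasattr(torch, "xpu") and torch.xpu.is_available():
    activities.append(ProfilerActivity.XPU)
    model, inputs = model.to("xpu"), inputs.to("xpu")

with profile(activities=activities, record_shapes=True) as prof:
    with torch.no_grad():
        x = inputs
        # Wrap each top-level layer in a named range so it shows up as its own row.
        for name, layer in model.named_children():
            with record_function(f"layer_{name}"):
                x = layer(x)

# Per-operator and per-range summary, sorted by total CPU time.
table = prof.key_averages().table(sort_by="cpu_time_total", row_limit=15)
print(table)
```

For a real LLM you would more likely wrap whole decoder blocks, or the full `generate` call, in `record_function` ranges rather than stepping through the layers by hand.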

  3. How can I map the profiling bottlenecks back to the source code?

In theory, VTune's source view can do this. You can follow this guide and give it a try: https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2023-0/viewing-source.html