caikit / caikit-nlp

Apache License 2.0

Add graph mode to embedding models which benefits from ipex optimize #359

Open devpramod opened 3 months ago

devpramod commented 3 months ago

Graph mode (TorchScript) is an additional step that can further accelerate a workload that has already been optimized with IPEX using bfloat16 mixed precision. On SPR (Sapphire Rapids) CPUs, graph mode utilizes AMX for additional speedups.

This PR modifies the config file to expose graph mode as an option to end users. The graph is compiled in the constructor of the embedding module so that it is ready for execution when requests arrive.
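For readers unfamiliar with the pattern, here is a minimal sketch of compiling the graph at construction time. It is not the code in this PR: `TinyEncoder`, `GraphModeEmbedder`, and the `graph_mode` flag are hypothetical names used only for illustration, and the sketch assumes the usual IPEX recipe of `ipex.optimize` followed by `torch.jit.trace` and `torch.jit.freeze`. IPEX is treated as optional so the example also runs on plain PyTorch.

```python
import torch
from torch import nn

try:
    # Optional: Intel Extension for PyTorch. Without it, the sketch
    # falls back to plain fp32 PyTorch.
    import intel_extension_for_pytorch as ipex
except ImportError:
    ipex = None


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for a real embedding model."""

    def __init__(self, vocab_size=100, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, input_ids):
        # Mean-pool token vectors into one embedding per sequence.
        return self.proj(self.embed(input_ids)).mean(dim=1)


class GraphModeEmbedder:
    """Hypothetical wrapper: builds the TorchScript graph in the constructor."""

    def __init__(self, model, example_input, graph_mode=False):
        model = model.eval()
        self.use_bf16 = ipex is not None
        if ipex is not None:
            model = ipex.optimize(model, dtype=torch.bfloat16)
        if graph_mode:
            with torch.no_grad(), torch.autocast(
                "cpu", dtype=torch.bfloat16, enabled=self.use_bf16
            ):
                # Trace once with a representative input, then freeze.
                model = torch.jit.trace(
                    model, example_input, check_trace=False, strict=False
                )
                model = torch.jit.freeze(model)
                # Warm-up calls so graph optimizations run before real requests.
                model(example_input)
                model(example_input)
        self.model = model

    def embed(self, input_ids):
        with torch.no_grad(), torch.autocast(
            "cpu", dtype=torch.bfloat16, enabled=self.use_bf16
        ):
            return self.model(input_ids)
```

Usage would look like `GraphModeEmbedder(TinyEncoder(), example_ids, graph_mode=True).embed(ids)`. Doing the trace and warm-up in the constructor moves the compilation cost to load time rather than the first request.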

devpramod commented 3 months ago

Hi @gkumbhat

  1. Enabling graph mode is most applicable to models that need high performance and efficiency in production. TorchScript's graph mode covers a wide range of PyTorch operations. It is not recommended when the model code includes complex post-processing that relies on custom Python libraries.

  2. The speedups depend on the workload at hand. More information can be found here: https://pytorch.org/blog/optimizing-production-pytorch-performance-with-graph-transformations/ On Intel hardware, using TorchScript along with ipex optimize makes use of AMX, an ISA extension with instructions that speed up AI workloads.

Working on addressing 1 & 2.

devpramod commented 3 months ago

Hi @gkumbhat, I have resolved the formatting, linting, and DCO check issues.