Checkpointing current status for visibility.

Contains the build system changes required to compile and link code that calls MKL functions, along with scripts to exercise those functions and gather data on their impact.

Tools and packages needed for testing MKL settings with `pytorch_inference`

Start with the latest ml_linux_build Docker image (30)
Build heaptrack and gperftools and install them under /usr/local/gcc103
yum install python3 (needed to run the inference test scripts)

Install intel-oneapi-mkl-devel-2024.0 as per the linux_image Dockerfile, then copy the MKL headers to /usr/local/gcc103:

(cd /opt/intel/oneapi/mkl/2024.0 && tar cf - include) | (cd /usr/local/gcc103 && tar xvf -)

For ps2pdf, dot, etc. (used by gperftools' pprof, see https://gperftools.github.io/gperftools/heapprofile.html):
- yum install ghostscript
- yum install graphviz

For heaptrack:
- yum install libunwind
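
Before building, it can help to confirm that the tools listed above are actually on the PATH. The following is a minimal sketch, assuming the executables are named `python3`, `ps2pdf`, `dot`, `pprof` and `heaptrack`; the names and the helper function are illustrative, and the list needs adjusting if any tool was installed somewhere that is not on the PATH.

```python
import shutil

# Executable names are assumptions based on the packages listed above.
REQUIRED_TOOLS = ["python3", "ps2pdf", "dot", "pprof", "heaptrack"]


def missing_tools(tools, which=shutil.which):
    """Return the subset of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools found")
```

The `which` parameter exists only so the lookup can be swapped out when testing the helper.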

Compiling the code.

Check out the code in this PR on a Linux x86_64 machine and configure CMake as normal, but ensure that pytorch_inference is linked against libtcmalloc. This can be done with e.g.

cmake -B cmake-build-relwithdebinfo -DLINK_TCMALLOC=ON
cmake --build cmake-build-relwithdebinfo -j`nproc` -t install
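
To confirm that the resulting binary really does depend on libtcmalloc, the output of `ldd` can be inspected. This is a minimal sketch that assumes libtcmalloc is linked dynamically (a static link would not show up in `ldd` output); the function names are illustrative.

```python
import subprocess


def links_tcmalloc(ldd_output):
    """Return True if any line of `ldd` output names a libtcmalloc library."""
    return any("libtcmalloc" in line for line in ldd_output.splitlines())


def check_binary(path):
    """Run ldd on `path` and report whether libtcmalloc is a dependency."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return links_tcmalloc(result.stdout)
```

For example, `check_binary("build/distribution/platform/linux-x86_64/bin/pytorch_inference")` should return `True` after a build configured as above.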

Running `pytorch_inference`

Several Python scripts in the bin/pytorch_inference directory can run pytorch_inference on various models, for example:

python3 main.py elser_model_2_linux-x86_64.pt ../../build/distribution/platform/linux-x86_64/bin/pytorch_inference inference_requests.json --num_threads_per_allocation 8 --cache_size 274756282

python3 evaluate.py bert-base-uncased-fill-mask.pt --memory_benchmark --num_threads_per_allocation=4

These scripts can be tweaked in various ways before running. In the case of evaluate.py, edit the script to:

Use either heapprof (from gperftools) or heaptrack.
Alter how many inferences are requested and in how many batches.
Choose how frequently to send the mkl_free_buffers control request.
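
Choosing how frequently to send the mkl_free_buffers control request amounts to interleaving a control message into the stream of inference requests. The sketch below shows only that scheduling logic. The function name, the `every` parameter and the placeholder control value are hypothetical and are not taken from evaluate.py; the real control request format is whatever pytorch_inference expects.

```python
def interleave_control(requests, every, control):
    """Yield each request, followed by `control` after every `every` requests."""
    if every <= 0:
        raise ValueError("every must be a positive integer")
    for count, request in enumerate(requests, start=1):
        yield request
        if count % every == 0:
            yield control


# Placeholder only: stands in for the real mkl_free_buffers control request.
CONTROL_PLACEHOLDER = "MKL_FREE_BUFFERS_PLACEHOLDER"
```

With `every=2`, a control request follows the second, fourth, sixth (and so on) inference request, which makes it easy to compare memory profiles across different frequencies.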

Viewing results

When pytorch_inference runs under heapprof, it generates a fairly large number of output files, e.g. /tmp/heapprof.0040.heap. These files need to be post-processed with pprof, e.g.:

pprof ../../build/distribution/platform/linux-x86_64/bin/pytorch_inference /tmp/heapprof.0040.heap --pdf > pytorch_inference_heapprof_0040.pdf

This generates a PDF of the heapprof results (other output formats are available).
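
Because heapprof writes many files, converting them one at a time is tedious. The following is a minimal sketch that builds one pprof invocation per heap file, mirroring the command above. It assumes the files follow the heapprof.NNNN.heap naming pattern and that `pprof` is on the PATH; the function names are illustrative.

```python
import re
import subprocess


def pprof_jobs(binary, heap_files):
    """Map each heap profile to a (pprof argv, output PDF name) pair."""
    jobs = []
    for heap in sorted(heap_files):
        match = re.search(r"\.(\d+)\.heap$", heap)
        if match is None:
            continue  # skip files that do not match heapprof.NNNN.heap
        pdf = "pytorch_inference_heapprof_{}.pdf".format(match.group(1))
        jobs.append((["pprof", binary, heap, "--pdf"], pdf))
    return jobs


def run_jobs(jobs):
    """Run each pprof command, writing its stdout to the matching PDF file."""
    for argv, pdf in jobs:
        with open(pdf, "wb") as out:
            subprocess.run(argv, stdout=out, check=True)
```

Passing the result of `glob.glob("/tmp/heapprof.*.heap")` as `heap_files` and then calling `run_jobs` on the returned list converts every profile in one go.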

heaptrack has its own GUI for viewing results (https://github.com/KDE/heaptrack?tab=readme-ov-file#heaptrack_gui), but it can also display results as plain text.

elastic / ml-cpp

[ML] Framework for testing effect of various MKL settings #2714
