[Feature] CPU Core Utilization For GPU Inference Benchmark script

What is the feature?

Is there a Python script to benchmark the CPU core usage of the RTMO model when it runs on a GPU, i.e. to see how much CPU load it takes to process N input images with a batch size of 4? Please note that this is not about inference on the CPU. The goal is to understand the CPU core usage of the non-GPU tasks of inference, if any, on Jetson devices or other small GPUs.

Any other context?

I tried watching htop on a 64-core machine: the load average reaches about 3-4 (roughly 3-4 vCPUs busy) at a batch size of 4.
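
To make the request concrete, something along these lines would probably be enough. This is only a rough sketch using the Python standard library, not an existing mmpose tool: `measure_cpu_usage`, `run_batch` and `fake_batch` are hypothetical names, and `fake_batch` is a stand-in for one real RTMO batch-of-4 inference call. It divides the process CPU time by the wall-clock time to estimate how many cores are busy on average.

```python
import time
from typing import Callable, Dict


def measure_cpu_usage(run_batch: Callable[[], None], num_batches: int) -> Dict[str, float]:
    """Estimate the average number of CPU cores kept busy while run_batch runs.

    run_batch is a hypothetical callable that performs one batch of inference.
    For a GPU model it should block until the GPU work is finished (with
    PyTorch, for example, by calling torch.cuda.synchronize() at the end),
    otherwise the wall-clock time is underestimated.
    """
    wall_start = time.perf_counter()
    # process_time() is user + system CPU time of this process, summed over
    # all of its threads. Child processes (e.g. data loader workers) are NOT
    # included.
    cpu_start = time.process_time()

    for _ in range(num_batches):
        run_batch()

    cpu_s = time.process_time() - cpu_start
    wall_s = time.perf_counter() - wall_start
    return {
        "wall_s": wall_s,
        "cpu_s": cpu_s,
        # CPU seconds per wall second: ~1.0 means one core fully busy.
        "avg_cores_busy": cpu_s / wall_s if wall_s > 0 else 0.0,
    }


def fake_batch() -> None:
    """Stand-in workload; replace with a real batch-of-4 inference call."""
    sum(i * i for i in range(200_000))


if __name__ == "__main__":
    stats = measure_cpu_usage(fake_batch, num_batches=25)
    print(f"wall time:      {stats['wall_s']:.3f} s")
    print(f"CPU time:       {stats['cpu_s']:.3f} s")
    print(f"avg cores busy: {stats['avg_cores_busy']:.2f}")
```

With N images and a batch size of 4, `num_batches` would be N / 4. The `avg_cores_busy` figure should be comparable to the load average seen in htop, but limited to this one process.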

open-mmlab / mmpose

[Feature] CPU Core Utilization For GPU Inference Benchmark script #3073