triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html

nv_inference_count no longer includes gpu_uuid? #7479

Open chriscarollo opened 2 months ago

chriscarollo commented 2 months ago

I have some Grafana graphs using Triton's Prometheus metrics, and it appears that in a semi-recent update nv_inference_count stopped including a gpu_uuid label (I see only "model" and "version"). I have a graph showing the number of inferences per GPU, which no longer works.
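For reference, this is roughly how I've been checking which labels the counter carries (assuming the default metrics port 8002; the model name below is just a placeholder):

```
# Scrape Triton's Prometheus endpoint and inspect the per-model counter.
curl -s localhost:8002/metrics | grep '^nv_inference_count'

# What I used to see (one series per GPU):
#   nv_inference_count{model="my_model",version="1",gpu_uuid="GPU-xxxx"} 42
# What I see now (no gpu_uuid label):
#   nv_inference_count{model="my_model",version="1"} 42
```

The per-GPU panel is built on a query along the lines of sum by (gpu_uuid) (rate(nv_inference_count[1m])), which returns nothing useful once the label is gone.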

rmccorm4 commented 2 months ago

Hi @chriscarollo, have you used the tritonserver --model-control-mode EXPLICIT ... (or POLL) feature to dynamically load/unload models before? I believe there may be a known inconsistency where models loaded at startup get no GPU_ID label on non-GPU metrics, while models dynamically loaded after the server has started do get these GPU_ID labels applied to non-GPU metrics.

Please let me know if you can consistently identify or reproduce this behavior one way or the other.
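If it helps, here is a rough repro sketch in EXPLICIT mode (model names are placeholders):

```
# 1. Start the server with one model loaded at startup:
tritonserver --model-repository=/models \
             --model-control-mode=explicit \
             --load-model=model_a

# 2. Dynamically load a second model via the model repository API:
curl -X POST localhost:8000/v2/repository/models/model_b/load

# 3. Run an inference against each model, then compare the label sets:
curl -s localhost:8002/metrics | grep '^nv_inference_count'
# If the inconsistency holds, model_a's series should lack the GPU label
# while model_b's should carry it.
```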

chriscarollo commented 2 months ago

I'm actually using model-control-mode POLL, and my gpu_uuid labels did come back after it detected new model versions. So it does look like it may only be an issue on initial startup?
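For context, the detection happens on Triton's periodic repository scan in POLL mode (interval set via --repository-poll-secs); in my case a new version directory appearing was enough to trigger a reload (paths below are placeholders):

```
# Add a new version directory; the next poll picks it up and reloads
# the model, after which the gpu_uuid labels reappeared for me.
mkdir /models/my_model/2
cp /staging/model.onnx /models/my_model/2/
```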

rmccorm4 commented 2 months ago

Hi @chriscarollo, this is a known issue and has a proposed resolution in this PR: https://github.com/triton-inference-server/core/pull/321. Please chime in on the discussion with your use case, impact, etc.