chriscarollo opened this issue 2 months ago
Hi @chriscarollo, have you used the `tritonserver --model-control-mode EXPLICIT ...` (or `POLL`) feature to dynamically load/unload models before? I believe there may be a known inconsistency where models loaded at startup have no GPU_ID label on non-GPU metrics, while models dynamically loaded after the server has started do have these GPU_ID labels applied to non-GPU-related metrics.
Please let me know if you can consistently identify or reproduce this behavior one way or the other.
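A rough reproduction sketch along those lines, assuming Triton's default ports (8000 for HTTP, 8002 for metrics); the repository path `/models` and model name `my_model` are placeholders:

```shell
# Start the server in explicit mode so no models load at startup.
tritonserver --model-repository=/models --model-control-mode=explicit

# In another shell: dynamically load a model via the repository API,
# then inspect its inference-count metric labels.
curl -X POST localhost:8000/v2/repository/models/my_model/load
curl -s localhost:8002/metrics | grep nv_inference_count
```

Comparing that `grep` output against a run where the same model is loaded at startup (i.e. without `--model-control-mode`) should show whether the label set differs between the two paths.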
I'm actually using model-control-mode POLL, and my gpu_id labels did come back after it detected new versions. So it looks like this may only be an issue on initial startup?
Hi @chriscarollo, this is a known issue and has a proposed resolution in this PR: https://github.com/triton-inference-server/core/pull/321. Please chime in on the discussion with your use case, impact, etc.
I have some Grafana graphs using Triton's Prometheus metrics, and it appears that in a semi-recent update nv_inference_count no longer includes a gpu_uuid label (I see only "model" and "version"). I have a graph showing the number of inferences per GPU, which no longer works.
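To make the breakage concrete: a per-GPU graph has to group `nv_inference_count` samples by their `gpu_uuid` label, so samples missing that label can't be attributed to any GPU. A minimal sketch of that grouping, assuming the Prometheus text exposition format scraped from the metrics port; the sample text below is illustrative, not real server output:

```python
import re
from collections import defaultdict

# Illustrative scrape: two samples carry gpu_uuid, one (as reported in
# this issue) is missing it.
SAMPLE = '''\
nv_inference_count{gpu_uuid="GPU-aaa",model="m1",version="1"} 10
nv_inference_count{gpu_uuid="GPU-bbb",model="m1",version="1"} 4
nv_inference_count{model="m2",version="1"} 7
'''

def inference_count_by_gpu(metrics_text):
    """Sum nv_inference_count per gpu_uuid label value.

    Samples missing the label fall under the key None, which is why a
    per-GPU dashboard panel silently loses them.
    """
    totals = defaultdict(float)
    sample_re = re.compile(r'^nv_inference_count\{([^}]*)\}\s+(\S+)')
    for line in metrics_text.splitlines():
        m = sample_re.match(line)
        if not m:
            continue
        labels = dict(re.findall(r'(\w+)="([^"]*)"', m.group(1)))
        totals[labels.get("gpu_uuid")] += float(m.group(2))
    return dict(totals)

print(inference_count_by_gpu(SAMPLE))
# → {'GPU-aaa': 10.0, 'GPU-bbb': 4.0, None: 7.0}
```

The equivalent PromQL in a Grafana panel would be something like `sum by (gpu_uuid) (rate(nv_inference_count[5m]))`, which returns nothing useful once the label disappears.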