Open kadisyy opened 3 months ago
I just want to confirm something:
my model has 1000 parameters, but the reported inference duration is only about 20 µs. I use a PromQL query like this:
`avg(rate(nv_inference_compute_infer_duration_us{app=~"$app", env="${env}"}[2m]) / rate(nv_inference_count{app=~"$app", env="${env}"}[2m])) by (model, instance_name)`
The duration feels too short to me;
is that expected?
Hi @kadisyy, it is a bit hard to answer the question without more context.
I would recommend timing your model's run outside of Triton to verify the numbers you are seeing.
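One way to time the model outside of Triton is a simple wall-clock benchmark with a warmup phase, then compare the result against the `nv_inference_compute_infer_duration_us` metric. This is a minimal sketch; `infer_fn` is a placeholder you would replace with your actual model call (the lambda below is just a stand-in):

```python
import time

def time_inference_us(infer_fn, warmup=10, iters=1000):
    """Return the average wall-clock time of infer_fn in microseconds."""
    # Warmup runs so one-time costs (JIT, caching, allocation) don't skew the average.
    for _ in range(warmup):
        infer_fn()
    start = time.perf_counter()
    for _ in range(iters):
        infer_fn()
    elapsed = time.perf_counter() - start
    return elapsed / iters * 1e6  # seconds -> microseconds

# Stand-in for a real model invocation; swap in your own inference call.
dummy_infer = lambda: sum(range(100))
avg_us = time_inference_us(dummy_infer)
print(f"average inference time: {avg_us:.1f} us")
```

For a very small model (1000 parameters), a compute time on the order of tens of microseconds is plausible; note that `nv_inference_compute_infer_duration_us` covers only the compute phase, not queueing or input/output handling.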