triton-inference-server / model_analyzer

Triton Model Analyzer is a CLI tool to help with better understanding of the compute and memory requirements of the Triton Inference Server models.
Apache License 2.0

Model_analyzer 1.42 creating config with kind_gpu only #927

Closed Kanupriyagoyal closed 1 week ago

Kanupriyagoyal commented 3 weeks ago

I am using Model Analyzer 1.42 and Triton server r24.07.

From https://github.com/triton-inference-server/model_analyzer/blob/main/docs/config.md#cli-and-yaml-config-options:

The docs describe `cpu_only_composing_models` as a "List of composing models that should be profiled using CPU instances only", and say `--cpu-only-composing-models` can only be specified for ensemble or BLS models. Is that correct?

I am not able to use it as a command-line option, but I am able to use it in the YAML config.
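For reference, the YAML form that worked for me looks roughly like this (a sketch; the model names here are placeholders, and `cpu_only_composing_models` is the key quoted from the docs above):

```yaml
# Sketch of a Model Analyzer config file; model names are hypothetical placeholders
profile_models:
  - my_ensemble_model
cpu_only_composing_models:
  - composing_model_a
  - composing_model_b
```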

The docs also list `collect_cpu_metrics` ("Specifies which metric(s) are to be collected", `[ collect_cpu_metrics: | default: false ]`), but passing it on the command line fails:

```
model-analyzer: error: unrecognized arguments: --collect_cpu_metrics
model-analyzer: error: unrecognized arguments: --collect_cpu_metrics true
```
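As a workaround while the underscore form is rejected, the option can be set in a YAML config file instead (assuming the key from the docs quote above, passed via `--config-file`):

```yaml
# config.yaml sketch; collect_cpu_metrics is the YAML key quoted from the docs
collect_cpu_metrics: true
```

Then: `model-analyzer profile --config-file config.yaml ...` (with the remaining options unchanged).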
Running Model Analyzer with:

```
model-analyzer profile --triton-launch-mode remote --client-protocol grpc --triton-grpc-endpoint x.xx.xx.xx:8001 --triton-metrics-url http://x.xx.xx.xx:8002/metrics --profile-models model1 --output-model-repository-path out_models --override-output-model-repository
```
I don't have a GPU in my system:
```
[Model Analyzer] Creating model config: model1_config_0
[Model Analyzer]   Setting instance_group to [{'count': 1, 'kind': 'KIND_GPU'}]
[Model Analyzer]
[Model Analyzer] Model model1_config_0 load failed: [StatusCode.INVALID_ARGUMENT] load failed for model 'model1': version 1 is at READY state: Invalid argument: instance group model1_0 of model model1 has kind KIND_GPU but server does not support GPUs;
```

In quick search mode, how will it use KIND_CPU instances only? How can I make sure the model runs only on CPU instances?

Also, how can I avoid the GPU metrics warnings?

nv-braf commented 3 weeks ago

On the CLI the option is `--collect-cpu-metrics` (dashes instead of underscores).
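With the dashed spelling, the earlier invocation would become something like the following (a sketch; the endpoints and model name are the placeholders from the thread, and this assumes `--collect-cpu-metrics` is a boolean flag that takes no value):

```shell
model-analyzer profile \
  --triton-launch-mode remote \
  --client-protocol grpc \
  --triton-grpc-endpoint x.xx.xx.xx:8001 \
  --triton-metrics-url http://x.xx.xx.xx:8002/metrics \
  --profile-models model1 \
  --collect-cpu-metrics \
  --output-model-repository-path out_models \
  --override-output-model-repository
```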

Kanupriyagoyal commented 3 weeks ago

#805 Questions regarding config search

Regarding the 2nd point in that issue: it says instances should always come out as KIND_CPU, but in my case the reverse is happening and I am getting KIND_GPU. Which parameter can I pass on the command line to make sure it creates KIND_CPU instances?

I also found PR https://github.com/triton-inference-server/model_analyzer/pull/806.

Running Model Analyzer with:

```
model-analyzer profile --triton-launch-mode remote --client-protocol grpc --triton-grpc-endpoint {my_ip}:{port} --triton-metrics-url http://{my_ip}:{port}/metrics --profile-models snapml_model --gpus [''] --output-model-repository-path out_models --override-output-model-repository
```
```
[Model Analyzer] No GPUs requested
[Model Analyzer] Creating model config: snapml_model_config_0
[Model Analyzer]   Setting instance_group to [{'count': 1, 'kind': 'KIND_GPU'}]
[Model Analyzer]   Setting max_batch_size to 1
[Model Analyzer]   Enabling dynamic_batching
[Model Analyzer]
[Model Analyzer] Model snapml_model_config_0 load failed: [StatusCode.INVALID_ARGUMENT] load failed for model 'snapml_model': version 1 is at UNAVAILABLE state: Invalid argument: instance group snapml_model_0 of model snapml_model has kind KIND_GPU but server does not support GPUs;

[Model Analyzer] No changes made to analyzer data, no checkpoint saved.
[Model Analyzer] Creating model config: snapml_model_config_1
[Model Analyzer]   Setting instance_group to [{'count': 2, 'kind': 'KIND_GPU'}]
[Model Analyzer]   Setting max_batch_size to 1
[Model Analyzer]   Enabling dynamic_batching
[Model Analyzer]
[Model Analyzer] Model snapml_model_config_1 load failed: [StatusCode.INVALID_ARGUMENT] load failed for model 'snapml_model': version 1 is at UNAVAILABLE state: Invalid argument: instance group snapml_model_0 of model snapml_model has kind KIND_GPU but server does not support GPUs;

[Model Analyzer] No changes made to analyzer data, no checkpoint saved.
[Model Analyzer] Creating model config: snapml_model_config_2
[Model Analyzer]   Setting instance_group to [{'count': 3, 'kind': 'KIND_GPU'}]
[Model Analyzer]   Setting max_batch_size to 1
[Model Analyzer]   Enabling dynamic_batching
[Model Analyzer]
[Model Analyzer] Model snapml_model_config_2 load failed: [StatusCode.INVALID_ARGUMENT] load failed for model 'snapml_model': version 1 is at UNAVAILABLE state: Invalid argument: instance group snapml_model_0 of model snapml_model has kind KIND_GPU but server does not support GPUs;

[Model Analyzer] No changes made to analyzer data, no checkpoint saved.
[Model Analyzer] Creating model config: snapml_model_config_3
[Model Analyzer]   Setting instance_group to [{'count': 4, 'kind': 'KIND_GPU'}]
[Model Analyzer]   Setting max_batch_size to 1
[Model Analyzer]   Enabling dynamic_batching
[Model Analyzer]
[Model Analyzer] Model snapml_model_config_3 load failed: [StatusCode.INVALID_ARGUMENT] load failed for model 'snapml_model': version 1 is at UNAVAILABLE state: Invalid argument: instance group snapml_model_0 of model snapml_model has kind KIND_GPU but server does not support GPUs;

[Model Analyzer] No changes made to analyzer data, no checkpoint saved.
[Model Analyzer] Creating model config: snapml_model_config_4
[Model Analyzer]   Setting instance_group to [{'count': 5, 'kind': 'KIND_GPU'}]
[Model Analyzer]   Setting max_batch_size to 1
[Model Analyzer]   Enabling dynamic_batching
[Model Analyzer]
[Model Analyzer] Model snapml_model_config_4 load failed: [StatusCode.INVALID_ARGUMENT] load failed for model 'snapml_model': version 1 is at UNAVAILABLE state: Invalid argument: instance group snapml_model_0 of model snapml_model has kind KIND_GPU but server does not support GPUs;

[Model Analyzer] No changes made to analyzer data, no checkpoint saved.
[Model Analyzer]
[Model Analyzer] Done with brute mode search.
```
Kanupriyagoyal commented 1 week ago

@nv-braf is there any option I can pass on the command line so that it creates KIND_CPU instances only?

nv-braf commented 1 week ago

Not on the command line, but you can specify `cpu_only` as a flag on the model in the YAML file. Please see our documentation for an example of how to do this: https://github.com/triton-inference-server/model_analyzer/blob/main/docs/config.md#cpu_only
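Based on the linked docs, the per-model `cpu_only` flag would look roughly like this (a sketch using the `snapml_model` name from this thread; check the docs for the exact schema):

```yaml
# Model Analyzer config.yaml sketch: force CPU-only profiling for one model
profile_models:
  snapml_model:
    cpu_only: true
```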

Kanupriyagoyal commented 1 week ago

@nv-braf Thanks, I had tried the `cpu_only` flag and it works fine. I was looking for a command-line option.