When I try to analyze my ensemble, I get this error:
Traceback (most recent call last):
  File "/usr/local/bin/model-analyzer", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/model_analyzer/entrypoint.py", line 278, in main
    analyzer.profile(
  File "/usr/local/lib/python3.10/dist-packages/model_analyzer/analyzer.py", line 128, in profile
    self._profile_models()
  File "/usr/local/lib/python3.10/dist-packages/model_analyzer/analyzer.py", line 247, in _profile_models
    self._model_manager.run_models(models=[model])
  File "/usr/local/lib/python3.10/dist-packages/model_analyzer/model_manager.py", line 151, in run_models
    self._stop_ma_if_no_valid_measurement_threshold_reached()
  File "/usr/local/lib/python3.10/dist-packages/model_analyzer/model_manager.py", line 245, in _stop_ma_if_no_valid_measurement_threshold_reached
    raise TritonModelAnalyzerException(
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: The first 2 attempts to acquire measurements have failed. Please examine the Tritonserver/PA error logs to determine what has gone wrong.
Command:
perf_analyzer -m model_mood_ensemble -b 1 -u 10.111.13.85:8001 -i grpc -f model_mood_ensemble-results.csv --verbose-csv --concurrency-range 4 --input-data perf_analyzer_data.json --shape link_to_file:1 --measurement-mode count_windows --collect-metrics --metrics-url http://10.111.13.85:8002/metrics --metrics-interval 1000
Error: perf_analyzer did not produce any output. It was likely terminated with a SIGABRT.
Command:
perf_analyzer -m model_mood_ensemble -b 1 -u 10.111.13.85:8001 -i grpc -f model_mood_ensemble-results.csv --verbose-csv --concurrency-range 2 --input-data perf_analyzer_data.json --shape link_to_file:1 --measurement-mode count_windows --collect-metrics --metrics-url http://10.111.13.85:8002/metrics --metrics-interval 1000
Error: perf_analyzer did not produce any output. It was likely terminated with a SIGABRT.
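For reference, perf_analyzer_data.json follows the standard perf_analyzer JSON input-data format; roughly like this (the URL is only a placeholder, and I'm assuming link_to_file is a BYTES input of shape [1], matching the --shape link_to_file:1 flag above):
{
  "data": [
    {
      "link_to_file": ["https://example.com/sample_input.wav"]
    }
  ]
}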
MA logs:
[Model Analyzer] Initializing GPUDevice handles
[Model Analyzer] Using GPU 0 NVIDIA A40 with UUID GPU-b84a3bad-5448-2ad9-3780-f0261c1d1eac
[Model Analyzer] Using GPU 1 NVIDIA A40 with UUID GPU-4aad5a35-99bd-90bd-b0e6-11bac59e302c
[Model Analyzer] WARNING: Overriding the output model repo path "/home/triton-server/profiling/output_model_repository"
[Model Analyzer] Using remote Triton Server
[Model Analyzer] WARNING: GPU memory metrics reported in the remote mode are not accurate. Model Analyzer uses Triton explicit model control to load/unload models. Some frameworks do not release the GPU memory even when the memory is not being used. Consider using the "local" or "docker" mode if you want to accurately monitor the GPU memory usage for different models.
[Model Analyzer] No checkpoint file found, starting a fresh run.
[Model Analyzer] WARNING: A model not being profiled (custom_mood_ensemble) is loaded on the remote Tritonserver. This could impact the profile results.
[Model Analyzer] WARNING: A model not being profiled (file_downloader) is loaded on the remote Tritonserver. This could impact the profile results.
[Model Analyzer] WARNING: A model not being profiled (model) is loaded on the remote Tritonserver. This could impact the profile results.
[Model Analyzer] WARNING: A model not being profiled (model_classifier) is loaded on the remote Tritonserver. This could impact the profile results.
[Model Analyzer] WARNING: A model not being profiled (model_classifier_postprocessor) is loaded on the remote Tritonserver. This could impact the profile results.
[Model Analyzer] WARNING: A model not being profiled (model_postprocessor) is loaded on the remote Tritonserver. This could impact the profile results.
[Model Analyzer] WARNING: A model not being profiled (model_preprocessor) is loaded on the remote Tritonserver. This could impact the profile results.
[Model Analyzer] Profiling server only metrics...
[Model Analyzer] Using remote Triton Server
[Model Analyzer] WARNING: GPU memory metrics reported in the remote mode are not accurate. Model Analyzer uses Triton explicit model control to load/unload models. Some frameworks do not release the GPU memory even when the memory is not being used. Consider using the "local" or "docker" mode if you want to accurately monitor the GPU memory usage for different models.
[Model Analyzer] Using remote Triton Server
[Model Analyzer] WARNING: GPU memory metrics reported in the remote mode are not accurate. Model Analyzer uses Triton explicit model control to load/unload models. Some frameworks do not release the GPU memory even when the memory is not being used. Consider using the "local" or "docker" mode if you want to accurately monitor the GPU memory usage for different models.
[Model Analyzer] Using remote Triton Server
[Model Analyzer] WARNING: GPU memory metrics reported in the remote mode are not accurate. Model Analyzer uses Triton explicit model control to load/unload models. Some frameworks do not release the GPU memory even when the memory is not being used. Consider using the "local" or "docker" mode if you want to accurately monitor the GPU memory usage for different models.
[Model Analyzer] Using remote Triton Server
[Model Analyzer] WARNING: GPU memory metrics reported in the remote mode are not accurate. Model Analyzer uses Triton explicit model control to load/unload models. Some frameworks do not release the GPU memory even when the memory is not being used. Consider using the "local" or "docker" mode if you want to accurately monitor the GPU memory usage for different models.
[Model Analyzer] Using remote Triton Server
[Model Analyzer] WARNING: GPU memory metrics reported in the remote mode are not accurate. Model Analyzer uses Triton explicit model control to load/unload models. Some frameworks do not release the GPU memory even when the memory is not being used. Consider using the "local" or "docker" mode if you want to accurately monitor the GPU memory usage for different models.
[Model Analyzer]
[Model Analyzer] Starting quick mode search to find optimal configs
[Model Analyzer]
[Model Analyzer] Creating model config: file_downloader_config_default
[Model Analyzer]
[Model Analyzer] Creating model config: model_preprocessor_config_default
[Model Analyzer]
[Model Analyzer] Creating model config: model_config_default
[Model Analyzer]
[Model Analyzer] Creating model config: model_postprocessor_config_default
[Model Analyzer]
[Model Analyzer] Creating ensemble model config: model_mood_ensemble_config_default
[Model Analyzer] Profiling model_mood_ensemble_config_default: concurrency=4
[Model Analyzer] WARNING: CPU metrics are being collected. This can affect the latency or throughput numbers reported by perf analyzer.
[Model Analyzer] perf_analyzer took very long to exit, killing perf_analyzer
[Model Analyzer] perf_analyzer did not produce any output.
[Model Analyzer] Saved checkpoint to /home/triton-server/profiling/checkpoints/0.ckpt
[Model Analyzer] Creating model config: file_downloader_config_0
[Model Analyzer] Setting instance_group to [{'count': 1, 'kind': 'KIND_CPU'}]
[Model Analyzer]
[Model Analyzer] Creating model config: model_preprocessor_config_0
[Model Analyzer] Setting instance_group to [{'count': 1, 'kind': 'KIND_CPU'}]
[Model Analyzer] Setting max_batch_size to 1
[Model Analyzer] Enabling dynamic_batching
[Model Analyzer]
[Model Analyzer] Creating model config: model_config_0
[Model Analyzer] Setting instance_group to [{'count': 1, 'kind': 'KIND_GPU'}]
[Model Analyzer]
[Model Analyzer] Creating model config: model_postprocessor_config_0
[Model Analyzer] Setting instance_group to [{'count': 1, 'kind': 'KIND_CPU'}]
[Model Analyzer]
[Model Analyzer] Creating ensemble model config: model_mood_ensemble_config_0
[Model Analyzer] Setting max_batch_size to 1
[Model Analyzer] Profiling model_mood_ensemble_config_0: concurrency=2
[Model Analyzer] perf_analyzer took very long to exit, killing perf_analyzer
[Model Analyzer] perf_analyzer did not produce any output.
[Model Analyzer] No changes made to analyzer data, no checkpoint saved.
Traceback (most recent call last):
  File "/usr/local/bin/model-analyzer", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/model_analyzer/entrypoint.py", line 278, in main
    analyzer.profile(
  File "/usr/local/lib/python3.10/dist-packages/model_analyzer/analyzer.py", line 128, in profile
    self._profile_models()
  File "/usr/local/lib/python3.10/dist-packages/model_analyzer/analyzer.py", line 247, in _profile_models
    self._model_manager.run_models(models=[model])
  File "/usr/local/lib/python3.10/dist-packages/model_analyzer/model_manager.py", line 151, in run_models
    self._stop_ma_if_no_valid_measurement_threshold_reached()
  File "/usr/local/lib/python3.10/dist-packages/model_analyzer/model_manager.py", line 245, in _stop_ma_if_no_valid_measurement_threshold_reached
    raise TritonModelAnalyzerException(
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: The first 2 attempts to acquire measurements have failed. Please examine the Tritonserver/PA error logs to determine what has gone wrong.
Moreover, according to the tritonserver logs, the ensemble receives requests and processes them.
Also, when I tried to run PA separately, I ran into a similar problem, so the issue may be in perf_analyzer itself rather than in Model Analyzer.
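For reference, the standalone run was essentially the same command MA generates above, e.g. (the extra -v is only to get verbose client-side output):
perf_analyzer -m model_mood_ensemble -b 1 -u 10.111.13.85:8001 -i grpc --concurrency-range 2 --input-data perf_analyzer_data.json --shape link_to_file:1 --measurement-mode count_windows -v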
MA version: 1.41.0
Triton release 2.47.0, corresponding to NGC container 24.06