riyajatar37003 opened 4 months ago
--triton-launch-mode=remote
tells model analyzer to not launch tritonserver. The expectation is that there is already a server up and running (usually on a different machine).
What is this mode, then?
This mode is beneficial when you want to use an already running Triton Inference Server. You may provide the URL of the Triton instance's HTTP or GRPC endpoint, depending on your chosen client protocol, using the --triton-http-endpoint or --triton-grpc-endpoint flags. You should also make sure that the same GPUs are available to both the Inference Server and Model Analyzer, and that they are on the same machine. Triton Server in this mode needs to be launched with the --model-control-mode=explicit flag to support loading/unloading of the models.
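A minimal sketch of that setup (the repository path, model name, and host:port below are placeholders, not values from this thread):

```shell
# On the serving machine: start Triton in explicit model-control mode,
# so Model Analyzer can load/unload models on demand.
tritonserver \
  --model-repository=/path/to/models \
  --model-control-mode=explicit

# On the Model Analyzer machine: point it at the already-running server.
# Use --triton-http-endpoint or --triton-grpc-endpoint to match your
# client protocol (HTTP shown here).
model-analyzer profile \
  --profile-models my_model \
  --triton-launch-mode=remote \
  --triton-http-endpoint=triton-host:8000 \
  --output-model-repository-path ./output
```

Note that in remote mode the output model repository path must still be reachable by the Triton server, since Model Analyzer writes candidate model configs there for the server to load.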
now i am getting this error
[Model Analyzer] Initializing GPUDevice handles
[Model Analyzer] Using GPU 0 NVIDIA A100-SXM4-40GB with UUID GPU-d9a0447f-f8fa-9d2f-79fc-ecf2567dacc2
[Model Analyzer] WARNING: Overriding the output model repo path "./rerenker_output1"
[Model Analyzer] Starting a local Triton Server
[Model Analyzer] Loaded checkpoint from file /model_repositories/checkpoints/0.ckpt
[Model Analyzer] GPU devices match checkpoint - skipping server metric acquisition
[Model Analyzer]
[Model Analyzer] Starting quick mode search to find optimal configs
[Model Analyzer]
[Model Analyzer] Creating model config: reranker_config_default
[Model Analyzer]
[Model Analyzer] Creating model config: bge_reranker_v2_onnx_config_default
[Model Analyzer]
[Model Analyzer] Profiling reranker_config_default: client batch size=1, concurrency=24
[Model Analyzer] Profiling bge_reranker_v2_onnx_config_default: client batch size=1, concurrency=8
[Model Analyzer]
[Model Analyzer] perf_analyzer took very long to exit, killing perf_analyzer
[Model Analyzer] perf_analyzer did not produce any output.
[Model Analyzer] Saved checkpoint to model_repositories/checkpoints/1.ckpt
[Model Analyzer] Creating model config: reranker_config_0
[Model Analyzer] Setting instance_group to [{'count': 1, 'kind': 'KIND_GPU'}]
[Model Analyzer] Setting max_batch_size to 1
[Model Analyzer] Enabling dynamic_batching
[Model Analyzer]
[Model Analyzer] Creating model config: bge_reranker_v2_onnx_config_0
[Model Analyzer] Setting instance_group to [{'count': 1, 'kind': 'KIND_GPU'}]
[Model Analyzer] Setting max_batch_size to 1
[Model Analyzer] Enabling dynamic_batching
[Model Analyzer]
[Model Analyzer] Profiling reranker_config_0: client batch size=1, concurrency=2
[Model Analyzer] Profiling bge_reranker_v2_onnx_config_0: client batch size=1, concurrency=2
[Model Analyzer]
[Model Analyzer] perf_analyzer took very long to exit, killing perf_analyzer
[Model Analyzer] perf_analyzer did not produce any output.
[Model Analyzer] No changes made to analyzer data, no checkpoint saved.
Traceback (most recent call last):
File "/opt/app_venv/bin/model-analyzer", line 8, in <module>
sys.exit(main())
File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/entrypoint.py", line 278, in main
analyzer.profile(
File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/analyzer.py", line 124, in profile
self._profile_models()
File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/analyzer.py", line 233, in _profile_models
self._model_manager.run_models(models=models)
File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/model_manager.py", line 145, in run_models
self._stop_ma_if_no_valid_measurement_threshold_reached()
File "/opt/app_venv/lib/python3.10/site-packages/model_analyzer/model_manager.py", line 239, in _stop_ma_if_no_valid_measurement_threshold_reached
raise TritonModelAnalyzerException(
model_analyzer.model_analyzer_exceptions.TritonModelAnalyzerException: The first 2 attempts to acquire measurements have failed. Please examine the Tritonserver/PA error logs to determine what has gone wrong.
MA is not receiving a measurement from Perf Analyzer within the timeout window (600s). After two attempts without measurements, MA exits and directs you to examine the error logs to determine what has gone wrong. There can be a variety of reasons why this is occurring. Please examine the PA error log for additional details.
Hi,
can you share an example command for this mode?
When launching the server I am running: "tritonserver --model-control-mode explicit --exit-on-error=false --model-repository=/tmp/models"
and in the other container I am running: "model-analyzer profile --profile-models reranker --triton-launch-mode=remote --output-model-repository-path ./output --export-path profile_results --triton-http-endpoint "
but Triton Server itself is not launching.