Unit tests have been added primarily within triton_cli/tests/test_cli.py for triton metrics, triton config, and triton status.
Brief overview of the tests:
For metrics:
From the test_models repository, mock_llm is loaded.
An individual inference is performed on each loaded model.
The triton metrics output is checked to confirm that nv_inference_request_success reflects the successful inferences.
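The counter check can be sketched as follows. This is only an illustration: it assumes the metrics are available as Prometheus-style text (the form Triton's server metrics endpoint uses), while the output of triton metrics itself may be shaped differently. The sample text and the helper name success_count are hypothetical and not taken from test_cli.py.

```python
import re

# Hypothetical sample in Prometheus text format; not captured from a real run.
SAMPLE_METRICS = """\
# TYPE nv_inference_request_success counter
nv_inference_request_success{model="mock_llm",version="1"} 1
nv_inference_request_success{model="add_sub",version="1"} 0
"""

# Matches one sample line of the counter: name, {labels}, then the value.
_SUCCESS_LINE = re.compile(
    r"^nv_inference_request_success\{([^}]*)\}[ \t]+(\S+)$", re.MULTILINE
)


def success_count(metrics_text: str, model: str) -> int:
    """Sum nv_inference_request_success over all versions of one model."""
    total = 0
    for labels, value in _SUCCESS_LINE.findall(metrics_text):
        if f'model="{model}"' in labels:
            total += int(float(value))
    return total
```

A test along these lines would assert that success_count(...) for the model is at least the number of inferences it sent.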
For config:
From the test_models repository, add_sub and mock_llm are loaded.
triton config -m model_name is called.
The name field of the returned JSON is compared against the original model_name.
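A minimal sketch of that comparison, assuming the command's output has already been captured as a string. The helper name config_name_matches and the sample payload are hypothetical; the real config contains more fields than shown here.

```python
import json


def config_name_matches(config_output: str, model_name: str) -> bool:
    """Return True if the JSON config's "name" field equals model_name."""
    config = json.loads(config_output)
    return config.get("name") == model_name


# Hypothetical captured output; only the "name" field matters for the check.
sample_output = '{"name": "add_sub", "backend": "python"}'
```

Parsing with json.loads also doubles as a check that the command emits valid JSON in the first place.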
For status:
From the test_models repository, add_sub and mock_llm are loaded.
triton status is called, and its output is checked to verify that the models are live and ready.
Secondary changes made while implementing the tests:
Adding grpcio>=1.64.0 to pyproject.toml.
Standardizing JSON outputs returned by triton_cli: Triton commands previously returned keys in single quotes and boolean values as {True, False} instead of {true, false}, which caused errors when the output was parsed as JSON.
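The difference is easy to reproduce in plain Python. The sketch below uses a hypothetical status-like payload and is not the triton_cli code itself: printing a dict directly emits its Python repr, whereas json.dumps emits text that json.loads accepts.

```python
import json

# Hypothetical payload, used only for illustration.
status = {"live": True, "ready": True}

python_repr = str(status)       # single-quoted keys, True/False
json_text = json.dumps(status)  # double-quoted keys, true/false


def parses_as_json(text: str) -> bool:
    """Return True if text can be decoded by json.loads."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True
```

Here parses_as_json(python_repr) is False while parses_as_json(json_text) is True, which is the failure mode the standardized output avoids.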
This PR is raised to address Jira Ticket [DLIS-6264].