triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

test: Load new model version should not reload loaded existing model version(s) #7527

Closed · kthui closed this 3 months ago

kthui commented 3 months ago

What does the PR do?

Add the following tests on model version reload:

  1. If a version is loaded and unmodified, then it should not be reloaded on the next load request.
  2. If a version is loaded but modified, then it should be reloaded on the next load request.
  3. If a version is not loaded and its model file is in the model directory, then it should be loaded on the next load request.
  4. If a version is not loaded and its model file is not in the model directory, then it should not be loaded on the next load request.
  5. If a generic file in the model directory is modified, e.g. model_directory/common_dependency.py, then all loaded version(s) should be reloaded on the next load request.

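The cases above all hinge on whether a model version's on-disk state changed since it was loaded. As a rough illustration (this is a hypothetical sketch, not Triton's actual implementation; `needs_reload` and the mtime-based check are assumptions for demonstration), the decision can be modeled as comparing a file's current modification time against the timestamp recorded at load time:

```python
import os
import tempfile

def needs_reload(path, loaded_mtime):
    """Hypothetical check: has `path` changed since it was loaded?"""
    return os.path.getmtime(path) > loaded_mtime

# Simulate a model file that is loaded, then modified.
with tempfile.TemporaryDirectory() as model_dir:
    model_file = os.path.join(model_dir, "model.py")
    with open(model_file, "w") as f:
        f.write("# version 1\n")
    loaded_mtime = os.path.getmtime(model_file)  # recorded at load time

    # Case 1: unmodified -> no reload on the next load request.
    unmodified_needs_reload = needs_reload(model_file, loaded_mtime)
    print(unmodified_needs_reload)  # False

    # Simulate an edit by bumping the mtime deterministically.
    os.utime(model_file, (loaded_mtime + 1, loaded_mtime + 1))

    # Case 2: modified -> reload on the next load request.
    modified_needs_reload = needs_reload(model_file, loaded_mtime)
    print(modified_needs_reload)  # True
```

Case 5 follows the same idea, except the timestamp comparison covers every file in the model directory rather than a single version's model file.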
Checklist

Commit Type:

Check the conventional commit type box here and add the label to the GitHub PR.

Related PRs:

https://github.com/triton-inference-server/core/pull/388

Where should the reviewer start?

Start with the core PR.

Test plan:

This PR adds the tests; see the Test plan on the core PR.

Caveats:

N/A

Background

N/A

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

N/A