fegler / triton_server_example

NVIDIA triton server example

How to load/unload a model to release GPU memory using your code? #1

Open HLH13297997663 opened 11 months ago

HLH13297997663 commented 11 months ago

Hello, this is nice work on Triton Server. I would like to ask how to load/unload a model in Triton's explicit mode. Do you have any relevant code or ideas? Looking forward to your reply.

fegler commented 11 months ago

I'm not sure exactly what you want, but I think this link will help: https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/model_management.html — see the "EXPLICIT" mode. My code is using "POLL" mode now.
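For reference, here is a minimal sketch of explicit-mode model control. It assumes the server was started with `tritonserver --model-control-mode=explicit`, that the `tritonclient` package is installed, and that the server's HTTP endpoint is at `localhost:8000`; the model name `my_model` is a placeholder. In explicit mode, Triton loads nothing at startup, so the client must request each load and unload itself.

```python
# Sketch: load and unload a model on a Triton server running in
# EXPLICIT model-control mode. Assumptions: tritonclient is installed,
# server at localhost:8000, model name "my_model" is hypothetical.

def model_control_url(base_url: str, model_name: str, action: str) -> str:
    """Build the Triton HTTP model-control endpoint URL.

    Triton exposes:
      POST /v2/repository/models/<name>/load
      POST /v2/repository/models/<name>/unload
    """
    return f"{base_url}/v2/repository/models/{model_name}/{action}"


def demo() -> None:
    # Requires a running Triton server; not executed at import time.
    import tritonclient.http as httpclient

    client = httpclient.InferenceServerClient(url="localhost:8000")
    client.load_model("my_model")    # loads the model (allocates GPU memory)
    # ... run inference here ...
    client.unload_model("my_model")  # unloads it, releasing GPU memory
```

Unloading via `unload_model` (or a plain `POST` to the `/unload` endpoint) is what actually frees the GPU memory the model occupied, which is the behavior the question asks about.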