triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Allow for explicit folder name when specifying where remote model repository will be downloaded. #6644

Open markthill opened 9 months ago

markthill commented 9 months ago

Is your feature request related to a problem? Please describe. When a model repository is downloaded from a remote location, other files may need to reference the downloaded files by an explicit path. Currently the remote repository support (in the AWS implementation) honors TRITON_AWS_MOUNT_DIRECTORY and downloads the files to the location you specify, but it then places them in a randomly named subdirectory following a folderXXXXXX convention. Placing the files in a randomly named folder defeats the purpose of specifying an explicit directory, because the resulting path can no longer be referenced from your files.
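The problem can be sketched as follows. This is an illustration of the naming pattern described above, not Triton's actual implementation; the mkdtemp-style template and the default path are assumptions:

```python
import os
import tempfile

def download_remote_repository(mount_dir: str) -> str:
    """Illustrative sketch: the downloaded files end up in a randomly
    suffixed subdirectory of the mount directory, following the
    folderXXXXXX convention reported in this issue."""
    os.makedirs(mount_dir, exist_ok=True)
    # mkdtemp fills in random characters after the prefix, so the
    # final path cannot be predicted before the download happens.
    repo_dir = tempfile.mkdtemp(prefix="folder", dir=mount_dir)
    # ... model files would be downloaded into repo_dir here ...
    return repo_dir

# "/tmp/triton_repo" is a placeholder mount location.
mount = os.environ.get("TRITON_AWS_MOUNT_DIRECTORY", "/tmp/triton_repo")
path = download_remote_repository(mount)
print(path)  # unpredictable suffix, e.g. .../folderab12cd3x
```

Anything that must reference the repository (backend configs, scripts) cannot be written ahead of time, because the suffix changes on every download.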

Describe the solution you'd like Either remove the folderXXXXXX directory that is created and place the files directly in TRITON_AWS_MOUNT_DIRECTORY, or provide another environment variable for an explicit folder name that overrides the folderXXXXXX naming convention.
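Until such an option exists, one workaround is to rename the randomly named entry to a stable name after the download completes. This is a sketch, not part of Triton; it assumes the mount directory contains exactly one folderXXXXXX entry:

```python
import os

def pin_repository_name(mount_dir: str, explicit_name: str) -> str:
    """Rename the single randomly named folderXXXXXX entry under
    mount_dir to an explicit, predictable name. Assumes exactly one
    such entry exists (sketch only; not part of Triton itself)."""
    candidates = [d for d in os.listdir(mount_dir)
                  if d.startswith("folder")
                  and os.path.isdir(os.path.join(mount_dir, d))]
    if len(candidates) != 1:
        raise RuntimeError(
            f"expected one folderXXXXXX entry, found {candidates}")
    target = os.path.join(mount_dir, explicit_name)
    os.rename(os.path.join(mount_dir, candidates[0]), target)
    return target

# Usage sketch: give the downloaded repository a stable path that
# other files (e.g. backend configs) can reference.
# stable = pin_repository_name("/tmp/triton_repo", "model_repository")
```

A symlink instead of a rename would also work and leaves the original directory untouched.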

Describe alternatives you've considered

  1. Using the Triton server as a base container, with an initialization process added to download the remote repository to a specific location.
  2. Using an init container to download the repository to an explicit location.
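The second alternative can be sketched as a Kubernetes init container that syncs the S3 bucket to a fixed path before the Triton container starts, bypassing the remote-repository download entirely. The bucket URI, image tags, and paths below are placeholders:

```yaml
# Sketch only: bucket URI, image tags, and paths are placeholders.
initContainers:
  - name: fetch-model-repo
    image: amazon/aws-cli
    args: ["s3", "sync", "s3://my-bucket/model_repository", "/models"]
    volumeMounts:
      - name: model-repo
        mountPath: /models
containers:
  - name: triton
    image: nvcr.io/nvidia/tritonserver:24.01-py3
    args: ["tritonserver", "--model-repository=/models"]
    volumeMounts:
      - name: model-repo
        mountPath: /models
```

Because Triton is then pointed at a local path, the files land exactly where specified and can be referenced by other configuration.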

Additional context N/A

kthui commented 9 months ago

Thanks for suggesting improvements. I have added a ticket for us to investigate further.

shixianc commented 8 months ago

Hi, is there any update on this feature? It would be quite useful for loading large LLMs from S3.

danielchalef commented 8 months ago

The tensorrt-llm backend requires setting gpt_model_path. This path can't be relative, so it fails with S3-based model repositories. Any update on this @kthui?
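For context, the tensorrt-llm backend reads the engine location from a gpt_model_path parameter in its config.pbtxt, which must resolve to a local path. With an S3 repository the files land under the random folderXXXXXX directory, so a value like the placeholder below cannot be written ahead of time:

```
parameters: {
  key: "gpt_model_path"
  value: { string_value: "/tmp/triton_repo/folderXXXXXX/tensorrt_llm/1" }
}
```

An explicit, predictable download directory would let this value be fixed in the config before the server starts.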

dyastremsky commented 7 months ago

Apologies for the delay and thank you for following up. I flagged the ticket so that we can look at prioritizing it.

jadhosn commented 6 months ago

@dyastremsky another interested customer here, any updates about prioritizing this feature?

dyastremsky commented 6 months ago

Thanks for checking in. I checked with folks about prioritization.

Alexis-Jacob commented 4 months ago

Hi, do you have an update on this feature to share with us?

dyastremsky commented 4 months ago

There's some prerequisite work that this feature depends on, and work on that is resuming.

Once it is merged, this feature can be worked on.

amakaido28 commented 2 weeks ago

Is there any news on this topic? The TRITON_AWS_MOUNT_DIRECTORY environment variable still creates a folderXXXXXX directory. Do you know how I can load my model repository from MinIO S3?