triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

Minio model repository stuck on downloading files with <=2.31.0 #5501

Open wesselvdv opened 1 year ago

wesselvdv commented 1 year ago

Description Having a minio model repository is causing an endless flurry of the following messages in the log:

I0314 10:24:23.004897 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:23.100826 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:23.181048 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:23.277722 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:23.372715 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:23.449236 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:23.545400 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:23.641129 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:23.718086 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:23.821001 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:23.916721 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:23.996569 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:24.092640 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:24.188223 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:24.221724 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:24.317352 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:24.417407 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:24.493051 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:24.601205 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:24.701305 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:24.778099 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:24.877257 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:24.976695 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:25.057218 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:25.156972 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:25.254228 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:25.365740 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:25.465607 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:25.590462 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:25.670828 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:25.769295 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:25.869778 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:25.948788 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:26.048503 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:26.144634 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:26.220952 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:26.320385 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:26.420665 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:26.496151 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:26.592215 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:26.688581 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:26.765461 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:26.872867 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:26.969036 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:27.044941 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:27.141840 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:27.241035 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:27.316899 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:27.413156 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:27.508900 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:27.594048 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:27.689367 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:27.785372 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:27.865454 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:27.961763 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:28.061957 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:28.137164 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:28.233381 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:28.328748 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:28.406209 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:28.506801 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:28.605555 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:28.681084 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:28.779214 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:28.877331 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:28.956658 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:29.058719 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:29.160362 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:29.241053 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:29.350630 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:29.456888 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:29.538849 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:29.641618 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:29.766596 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:29.852873 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:29.953860 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:30.054182 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:30.129003 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:30.225690 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:30.320839 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:30.397537 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:30.497835 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:30.596737 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:30.674031 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:30.772531 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/
I0314 10:24:30.881500 23 filesystem.cc:2354] Using credential    for path  s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/

Triton Information We're using the latest container version (23.02).

To Reproduce Try and use a minio bucket with control_mode=none, and it'll be stuck on startup on the same file endlessly.

Expected behavior Should start up normally and load all the models successfully. The model configuration is correct: I worked around this issue in previous versions by downloading all the configuration locally on startup and pointing Triton to the local directory.

Tabrizian commented 1 year ago

@kthui Do you know what could be the issue here?

wesselvdv commented 1 year ago

Is there any additional debug output I can turn on to expedite this? This is the first version where I can see anything beyond it just hanging.

kthui commented 1 year ago

Hi @wesselvdv, could you give the following a try to help us narrow down the cause of the issue?

  1. Can you quickly validate that s3://https://<snip>:443/p-triton/wav2vec/config.pbtxt/ on minio is actually a file and not a directory?
  2. Can you try setting --model-control-mode=explicit when starting the server, then load a small model (small in file size) using the load API from the client, and see if the hang is still reproducible? It is possible the model(s) just take a long time to be transmitted over the network.

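
For the second suggestion, here is a minimal Python sketch of calling the load endpoint of Triton's HTTP model repository API with only the standard library. The base URL `http://localhost:8000`, the 60 second timeout, and the helper names `build_load_request` and `load_model` are assumptions for illustration; adjust them to your deployment. The timeout is there so that a hang surfaces as an exception instead of blocking forever.

```python
import json
import urllib.parse
import urllib.request


def build_load_request(base_url: str, model_name: str) -> urllib.request.Request:
    """Build a POST request for the model repository load endpoint.

    base_url is an assumed value such as "http://localhost:8000".
    """
    # Percent-encode the model name so it is safe to embed in the URL path.
    quoted_name = urllib.parse.quote(model_name, safe="")
    url = f"{base_url.rstrip('/')}/v2/repository/models/{quoted_name}/load"
    return urllib.request.Request(
        url,
        data=json.dumps({}).encode("utf-8"),  # empty JSON body, no extra parameters
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def load_model(base_url: str, model_name: str, timeout_s: float = 60.0) -> int:
    """Send the load request and return the HTTP status code.

    Raises an exception (for example on timeout or an HTTP error status)
    rather than waiting indefinitely.
    """
    request = build_load_request(base_url, model_name)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.status
```

With the server started in explicit mode, `load_model("http://localhost:8000", "wav2vec")` would request a load of the `wav2vec` model from the log above; trying a much smaller model first helps separate a slow transfer from a real hang.
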
wesselvdv commented 1 year ago

Hi @kthui I've tried your second suggestion and that does seem to work! That didn't work in a previous version (it would hang indefinitely), but it does now.

krishung5 commented 1 year ago

Glad it works! I'm closing this issue but please let us know if you would like follow-up and we will reopen the ticket.

wesselvdv commented 1 year ago

Seems to be broken again unfortunately, not sure what caused it to hang again. We didn't change anything in our setup.

wesselvdv commented 1 year ago

Are there any other steps we can take (e.g. enable more debug)?

krishung5 commented 1 year ago

@wesselvdv Running Triton with verbose logging enabled should provide more context: tritonserver ... ... --log-verbose=1. You can also try to build with debug symbols if you'd like to run with gdb.

wesselvdv commented 1 year ago

@wesselvdv Running Triton with logging enabled should provide more context: tritonserver ... ... --log-verbose=1. You can also try to build with debug symbols if you'd like to run with gdb.

I had already set the debug level to 2, assuming that it also shows everything level 1 does. I'll look into gdb, though I'm not sure I can do a remote session with it.

kthui commented 1 year ago

I think the underlying S3 client (or the minio server) could be misreporting a file as a directory to Triton, which would cause the infinite loop. I have already filed a ticket for us to investigate further.
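
To check the file-versus-directory question from the outside, it may help to list the object keys in the bucket with whatever S3 tooling you already use and classify the path yourself. The sketch below is only an illustration of that check, not Triton's own logic: `classify_s3_path` is a hypothetical helper, and the key list is assumed to be relative to the bucket. S3 has no real directories, so a path counts as a "directory" here when some key sits under `path/`, including a marker object whose key is exactly `path/`.

```python
def classify_s3_path(keys, path):
    """Classify path as 'file', 'directory', 'both', or 'missing'.

    keys: iterable of object keys as returned by a bucket listing (assumed input).
    path: the path to check, with or without leading/trailing slashes.
    """
    key_list = list(keys)
    normalized = path.strip("/")
    # A file is an object whose key matches the path exactly.
    is_file = normalized in key_list
    # A directory is implied by any key under "path/", marker objects included.
    prefix = normalized + "/"
    is_directory = any(key.startswith(prefix) for key in key_list)

    if is_file and is_directory:
        return "both"
    if is_file:
        return "file"
    if is_directory:
        return "directory"
    return "missing"
```

For the path in the log, a result of `"directory"` or `"both"` for `wav2vec/config.pbtxt/` would point at the bucket contents (for example a stray marker object), while `"file"` would point more towards the client or server misreporting it.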

wesselvdv commented 1 year ago

@kthui Let me know if there's something I can do to help!