isarsoft / yolov4-triton-tensorrt

This repository deploys YOLOv4 as an optimized TensorRT engine to Triton Inference Server
http://www.isarsoft.com

error: creating server: Internal - failed to load all models #32

Closed KevenLee closed 3 years ago

KevenLee commented 3 years ago

I encountered some errors when running the server. I created a "triton-deploy" folder in the same directory as "yolov4-triton-tensorrt", laid out like this:

triton-deploy/models/yolov4/1/model.plan
triton-deploy/plugins/liblayerplugin.so
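For reference, that layout can be recreated with something like the following (the file placement is my reading of the issue text, not a verified recipe):

```shell
# Expected Triton model-repository layout for a TensorRT plan model:
mkdir -p triton-deploy/models/yolov4/1   # model.plan goes here
mkdir -p triton-deploy/plugins           # liblayerplugin.so goes here
```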

I then ran:

docker run --gpus all --rm --shm-size=1g --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v$(pwd)/triton-deploy/models:/models -v$(pwd)/triton-deploy/plugins:/plugins --env LD_PRELOAD=/plugins/liblayerplugin.so nvcr.io/nvidia/tritonserver:20.10-py3 tritonserver --model-repository=/models --strict-model-config=false --grpc-infer-allocation-pool-size=16 --log-verbose 1

and got these errors:

== Triton Inference Server ==

NVIDIA Release 20.10 (build )

Copyright (c) 2018-2020, NVIDIA CORPORATION. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION. All rights reserved. NVIDIA modifications are covered by the license terms that apply to the underlying project or file.

I0517 06:20:39.062929 1 metrics.cc:184] found 4 GPUs supporting NVML metrics
I0517 06:20:39.068694 1 metrics.cc:193] GPU 0: GeForce RTX 2080 Ti
I0517 06:20:39.074545 1 metrics.cc:193] GPU 1: GeForce RTX 2080 Ti
I0517 06:20:39.080262 1 metrics.cc:193] GPU 2: GeForce RTX 2080 Ti
I0517 06:20:39.086189 1 metrics.cc:193] GPU 3: GeForce RTX 2080 Ti
I0517 06:20:39.343302 1 pinned_memory_manager.cc:195] Pinned memory pool is created at '0x7ff9e6000000' with size 268435456
I0517 06:20:39.345681 1 cuda_memory_manager.cc:98] CUDA memory pool is created on device 0 with size 67108864
I0517 06:20:39.345690 1 cuda_memory_manager.cc:98] CUDA memory pool is created on device 1 with size 67108864
I0517 06:20:39.345694 1 cuda_memory_manager.cc:98] CUDA memory pool is created on device 2 with size 67108864
I0517 06:20:39.345697 1 cuda_memory_manager.cc:98] CUDA memory pool is created on device 3 with size 67108864
W0517 06:20:39.633003 1 server.cc:235] failed to enable peer access for some device pairs
I0517 06:20:39.633077 1 netdef_backend_factory.cc:46] Create NetDefBackendFactory
I0517 06:20:39.633086 1 plan_backend_factory.cc:48] Create PlanBackendFactory
I0517 06:20:39.633091 1 plan_backend_factory.cc:55] Registering TensorRT Plugins
I0517 06:20:39.633160 1 logging.cc:52] Registered plugin creator - ::GridAnchor_TRT version 1
I0517 06:20:39.633175 1 logging.cc:52] Registered plugin creator - ::NMS_TRT version 1
I0517 06:20:39.633185 1 logging.cc:52] Registered plugin creator - ::Reorg_TRT version 1
I0517 06:20:39.633194 1 logging.cc:52] Registered plugin creator - ::Region_TRT version 1
I0517 06:20:39.633203 1 logging.cc:52] Registered plugin creator - ::Clip_TRT version 1
I0517 06:20:39.633211 1 logging.cc:52] Registered plugin creator - ::LReLU_TRT version 1
I0517 06:20:39.633221 1 logging.cc:52] Registered plugin creator - ::PriorBox_TRT version 1
I0517 06:20:39.633232 1 logging.cc:52] Registered plugin creator - ::Normalize_TRT version 1
I0517 06:20:39.633242 1 logging.cc:52] Registered plugin creator - ::RPROI_TRT version 1
I0517 06:20:39.633253 1 logging.cc:52] Registered plugin creator - ::BatchedNMS_TRT version 1
I0517 06:20:39.633262 1 logging.cc:52] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
I0517 06:20:39.633271 1 logging.cc:52] Registered plugin creator - ::FlattenConcat_TRT version 1
I0517 06:20:39.633280 1 logging.cc:52] Registered plugin creator - ::CropAndResize version 1
I0517 06:20:39.633288 1 logging.cc:52] Registered plugin creator - ::DetectionLayer_TRT version 1
I0517 06:20:39.633297 1 logging.cc:52] Registered plugin creator - ::Proposal version 1
I0517 06:20:39.633306 1 logging.cc:52] Registered plugin creator - ::ProposalLayer_TRT version 1
I0517 06:20:39.633315 1 logging.cc:52] Registered plugin creator - ::PyramidROIAlign_TRT version 1
I0517 06:20:39.633329 1 logging.cc:52] Registered plugin creator - ::ResizeNearest_TRT version 1
I0517 06:20:39.633336 1 logging.cc:52] Registered plugin creator - ::Split version 1
I0517 06:20:39.633343 1 logging.cc:52] Registered plugin creator - ::SpecialSlice_TRT version 1
I0517 06:20:39.633352 1 logging.cc:52] Registered plugin creator - ::InstanceNormalization_TRT version 1
I0517 06:20:39.633364 1 libtorch_backend_factory.cc:53] Create LibTorchBackendFactory
I0517 06:20:39.633374 1 custom_backend_factory.cc:46] Create CustomBackendFactory
I0517 06:20:39.633379 1 backend_factory.h:44] Create TritonBackendFactory
I0517 06:20:39.633396 1 ensemble_backend_factory.cc:47] Create EnsembleBackendFactory
I0517 06:20:39.633522 1 autofill.cc:142] TensorFlow SavedModel autofill: Internal: unable to autofill for 'yolov4', unable to find savedmodel directory named 'model.savedmodel'
I0517 06:20:39.633546 1 autofill.cc:155] TensorFlow GraphDef autofill: Internal: unable to autofill for 'yolov4', unable to find graphdef file named 'model.graphdef'
I0517 06:20:39.633568 1 autofill.cc:168] PyTorch autofill: Internal: unable to autofill for 'yolov4', unable to find PyTorch file named 'model.pt'
I0517 06:20:39.633592 1 autofill.cc:180] Caffe2 NetDef autofill: Internal: unable to autofill for 'yolov4', unable to find netdef files: 'model.netdef' and 'init_model.netdef'
I0517 06:20:39.633622 1 autofill.cc:212] ONNX autofill: Internal: unable to autofill for 'yolov4', unable to find onnx file or directory named 'model.onnx'
E0517 06:20:55.232289 1 logging.cc:43] coreReadArchive.cpp (41) - Serialization Error in verifyHeader: 0 (Version tag does not match. Note: Current Version: 96, Serialized Engine Version: 89)
E0517 06:20:55.232422 1 logging.cc:43] INVALID_STATE: std::exception
E0517 06:20:55.232434 1 logging.cc:43] INVALID_CONFIG: Deserialize the cuda engine failed.
I0517 06:20:55.233322 1 autofill.cc:225] TensorRT autofill: Internal: unable to autofill for 'yolov4', unable to find a compatible plan file.
W0517 06:20:55.233338 1 autofill.cc:265] Proceeding with simple config for now
I0517 06:20:55.233346 1 model_config_utils.cc:637] autofilled config: name: "yolov4"

E0517 06:20:55.233899 1 model_repository_manager.cc:1604] unexpected platform type for yolov4
I0517 06:20:55.233944 1 server.cc:141]
+---------+--------+------+
| Backend | Config | Path |
+---------+--------+------+
+---------+--------+------+

I0517 06:20:55.233951 1 model_repository_manager.cc:469] BackendStates()
I0517 06:20:55.233960 1 server.cc:184]
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
+-------+---------+--------+

I0517 06:20:55.234043 1 tritonserver.cc:1621]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                              |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                             |
| server_version                   | 2.4.0                                                                                                                                              |
| server_extensions                | classification sequence model_repository schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics  |
| model_repository_path[0]         | /models                                                                                                                                            |
| model_control_mode               | MODE_NONE                                                                                                                                          |
| strict_model_config              | 0                                                                                                                                                  |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                          |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                           |
| cuda_memory_pool_byte_size{1}    | 67108864                                                                                                                                           |
| cuda_memory_pool_byte_size{2}    | 67108864                                                                                                                                           |
| cuda_memory_pool_byte_size{3}    | 67108864                                                                                                                                           |
| min_supported_compute_capability | 6.0                                                                                                                                                |
| strict_readiness                 | 1                                                                                                                                                  |
| exit_timeout                     | 30                                                                                                                                                 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+

I0517 06:20:55.234053 1 server.cc:280] Waiting for in-flight requests to complete.
I0517 06:20:55.234060 1 model_repository_manager.cc:435] LiveBackendStates()
I0517 06:20:55.234064 1 server.cc:295] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models

KevenLee commented 3 years ago

Sorry, I have found the reason: the version of the nvcr.io/nvidia/tensorrt container used to build the engine must match the version of the nvcr.io/nvidia/tritonserver container used to serve it, e.g. both 20.10-py3.
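The fix above boils down to keeping the two container tags in sync, since a plan file serialized by one TensorRT version cannot be deserialized by another (the "Version tag does not match" error in the log). A minimal sketch of that sanity check, assuming the image tags follow NGC's `YY.MM-py3` convention (the helper name is hypothetical, not part of Triton):

```python
# Hypothetical pre-deployment check: the TensorRT build container and the
# Triton serving container must carry the same release tag, e.g. "20.10-py3".
# A mismatch leads to the engine-deserialization failure seen in the log.
def tags_match(build_image: str, serve_image: str) -> bool:
    tag = lambda image: image.rsplit(":", 1)[-1]  # "repo:tag" -> "tag"
    return tag(build_image) == tag(serve_image)

# Matching pair, as in the fix:
assert tags_match("nvcr.io/nvidia/tensorrt:20.10-py3",
                  "nvcr.io/nvidia/tritonserver:20.10-py3")
# Mismatched pair, as in the original error:
assert not tags_match("nvcr.io/nvidia/tensorrt:20.03-py3",
                      "nvcr.io/nvidia/tritonserver:20.10-py3")
```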