**Closed** — taorui-plus closed this issue 6 months ago
This is the complete log output after running `bash launch_server.sh`; nothing looks wrong to me:
```
I0409 02:55:34.488607 25157 pinned_memory_manager.cc:275] Pinned memory pool is created at '0x7f8b7c000000' with size 2048000000
I0409 02:55:34.491989 25157 cuda_memory_manager.cc:107] CUDA memory pool is created on device 0 with size 4096000000
I0409 02:55:34.498199 25157 model_lifecycle.cc:461] loading: whisper:1
[TensorRT-LLM] TensorRT-LLM version: 0.9.0.dev2024022700
I0409 02:55:38.325204 25157 python_be.cc:2362] TRITONBACKEND_ModelInstanceInitialize: whisper_0_0 (CPU device 0)
[TensorRT-LLM] TensorRT-LLM version: 0.9.0.dev2024022700
I0409 02:55:44.622592 25157 model_lifecycle.cc:827] successfully loaded 'whisper'
I0409 02:55:44.622715 25157 server.cc:606]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I0409 02:55:44.622775 25157 server.cc:633]
+---------+------------------------------------------------------+------------------------------------------------------------+
| Backend | Path                                                 | Config                                                     |
+---------+------------------------------------------------------+------------------------------------------------------------+
| python  | /opt/tritonserver/backends/python/libtriton_python   | {"cmdline":{"auto-complete-config":"true","backend-direct  |
|         | .so                                                  | ory":"/opt/tritonserver/backends","min-compute-capability  |
|         |                                                      | ":"6.000000","default-max-batch-size":"4"}}                |
+---------+------------------------------------------------------+------------------------------------------------------------+
I0409 02:55:44.622815 25157 server.cc:676]
+---------+---------+--------+
| Model   | Version | Status |
+---------+---------+--------+
| whisper | 1       | READY  |
+---------+---------+--------+
I0409 02:55:44.672535 25157 metrics.cc:877] Collecting metrics for GPU 0: NVIDIA A10G
I0409 02:55:44.680349 25157 metrics.cc:770] Collecting CPU metrics
I0409 02:55:44.680501 25157 tritonserver.cc:2498]
+----------------------------------+----------------------------------------------------------------------------------+
| Option                           | Value                                                                            |
+----------------------------------+----------------------------------------------------------------------------------+
| server_id                        | triton                                                                           |
| server_version                   | 2.42.0                                                                           |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents)     |
|                                  | schedule_policy model_configuration system_shared_memory cuda_shared_memory      |
|                                  | binary_tensor_data parameters statistics trace logging                           |
| model_repository_path[0]         | ./model_repo_whisper_trtllm                                                      |
| model_control_mode               | MODE_NONE                                                                        |
| strict_model_config              | 0                                                                                |
| rate_limit                       | OFF                                                                              |
| pinned_memory_pool_byte_size     | 2048000000                                                                       |
| cuda_memory_pool_byte_size{0}    | 4096000000                                                                       |
| min_supported_compute_capability | 6.0                                                                              |
| strict_readiness                 | 1                                                                                |
| exit_timeout                     | 30                                                                               |
| cache_enabled                    | 0                                                                                |
+----------------------------------+----------------------------------------------------------------------------------+
I0409 02:55:44.681779 25157 grpc_server.cc:2519] Started GRPCInferenceService at 0.0.0.0:8001
I0409 02:55:44.682024 25157 http_server.cc:4623] Started HTTPService at 0.0.0.0:10086
I0409 02:55:44.723103 25157 http_server.cc:315] Started Metrics Service at 0.0.0.0:10087
```
Running

```
curl 0.0.0.0:10086
```

returns `{"error":"Not Found"}`. There are no other log entries, so I can't tell which step went wrong.
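For what it's worth, a `Not Found` on the bare root path is not by itself a sign of failure: Triton's HTTP frontend serves its API under `/v2/...` routes rather than at `/`. A minimal stdlib sketch for probing the standard readiness endpoint is below (the port `10086` is taken from the logs above; the `/v2/health/ready` path assumes a stock Triton HTTP frontend):

```python
import urllib.error
import urllib.request


def check_triton_ready(base_url="http://0.0.0.0:10086"):
    """Probe Triton's standard HTTP readiness endpoint.

    Returns True if the server reports ready (HTTP 200), False if it
    answers with an HTTP error, and None if it is unreachable.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/v2/health/ready", timeout=3) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        # Server answered, but not with 200 (e.g. not ready yet).
        return False
    except (urllib.error.URLError, OSError):
        # Connection refused / timed out: server not reachable at all.
        return None
```

If this returns `True`, the HTTP frontend is up and the issue is only that `/` is not a served route.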
@taorui-plus That's probably not the right way to use curl here. Once the server is up, could you try `client.py` as described in the README? `client.py` uses the grpc client from `tritonclient`. Direct HTTP/curl requests are feasible in principle, they just aren't supported yet.
Problem solved, thanks a lot for the explanation!