NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0
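For context, the Python API mentioned above is typically exercised through the high-level `LLM` entry point; a minimal sketch, assuming a recent release (the model name is illustrative, any supported checkpoint works):

```python
from tensorrt_llm import LLM, SamplingParams

# Build a TensorRT engine from a checkpoint and run inference with it.
# (Model name is illustrative; any supported checkpoint works.)
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
params = SamplingParams(max_tokens=32)

for out in llm.generate(["What does TensorRT-LLM do?"], params):
    print(out.outputs[0].text)
```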

After deployment, each request exception generates a core.xxxx file #1715

Open · taorui-plus opened this issue 3 months ago

taorui-plus commented 3 months ago

System Info

GPU: NVIDIA-SMI 535.161.08, Driver Version: 535.161.08, CUDA Version: 12.3

Who can help?

@Pzzzzz5142 @fjosw @ami

Information

Tasks

Reproduction

Start the service from this directory: sherpa/triton/whisper

Expected behavior

Report the error without generating core.xxxx files.

actual behavior

Too many core.xxxx files are generated, each 2.4 GB in size. As the number of failing requests grows, they can fill the disk very quickly.
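As a stopgap while the underlying crash is debugged, the dumps could be pruned automatically; a hypothetical sketch, assuming the default core.&lt;pid&gt; naming in the service's working directory:

```python
import glob
import os

# Hypothetical cleanup: cap the total disk space used by core dumps.
# Assumes dumps follow the default core.<pid> naming in the current
# working directory; adjust the pattern to match kernel.core_pattern.
MAX_BYTES = 10 * 2**30  # keep at most ~10 GiB of dumps

cores = sorted(glob.glob("core.*"), key=os.path.getmtime)  # oldest first
total = sum(os.path.getsize(p) for p in cores)
for path in cores:
    if total <= MAX_BYTES:
        break
    total -= os.path.getsize(path)
    os.remove(path)
```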

additional notes

Is there any setting that can prevent the generation of core.xxxx files?
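For reference, core-dump creation is governed by the operating system's RLIMIT_CORE rather than by a TensorRT-LLM option; a minimal sketch that disables dumps from inside the serving process, assuming a Linux host (the shell equivalent is `ulimit -c 0` in the launch script):

```python
import resource

# Set the core-file size limit to zero so crashes no longer write
# core.xxxx files. Applies to this process and any children forked
# afterwards; this is an OS-level setting, not a TensorRT-LLM one.
resource.setrlimit(resource.RLIMIT_CORE, (0, 0))
```

Note that this only suppresses the dump files; the crashing requests themselves still need to be diagnosed.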

nv-guomingz commented 3 months ago

Hi @Shixiaowei02, could you please add some comments here?

github-actions[bot] commented 1 month ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 15 days.