Open · wufxgtihub123 opened 4 days ago
`--disable-log-stats` and `--disable-log-requests` do not disable all logging; they disable the logging of stats and of request contents, respectively. They work as intended.
`temp/ray` is written to by Ray. That output is disabled by setting `ray.init(log_to_driver=False)`, which would be done here:
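Since `ray.init` is called inside vLLM's engine setup rather than in user code, one workaround is to patch `ray.init` before the server starts so that `log_to_driver=False` is always passed. A minimal sketch of that wrapper is below; `fake_init` is a hypothetical stand-in used only so the sketch runs without Ray installed, and note that `log_to_driver=False` stops worker logs being forwarded to the driver but Ray may still write session files under its temp directory.

```python
import functools

def force_quiet(init_fn):
    """Return a wrapper that forces log_to_driver=False on every call."""
    @functools.wraps(init_fn)
    def wrapper(*args, **kwargs):
        kwargs["log_to_driver"] = False
        return init_fn(*args, **kwargs)
    return wrapper

# In a real launcher you would patch ray.init before vLLM imports it:
#   import ray
#   ray.init = force_quiet(ray.init)

# Demonstration with a stand-in so the sketch runs without ray installed:
def fake_init(**kwargs):
    # Echo back the keyword arguments that would reach ray.init.
    return kwargs

quiet_init = force_quiet(fake_init)
print(quiet_init(address="auto"))
```

Running the sketch shows that `log_to_driver=False` is injected into every call, whatever other arguments the caller passes.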
Your current environment
🐛 Describe the bug
When I start the vLLM OpenAI-compatible inference server from the command line:

```shell
CUDA_VISIBLE_DEVICES=1 python -m vllm.entrypoints.openai.api_server \
    --model /date/pretrained_models/Qwen1.5-14B-Chat \
    --trust-remote-code \
    --served-model-name qwen7b \
    --api-key sk-abcd \
    --port 8005 \
    --gpu-memory-utilization 0.8 \
    --max-model-len 6832 \
    --tensor-parallel-size 2 \
    --disable-log-requests \
    --disable-log-stats
```

it continuously generates log files under temp/ray/, which take up a lot of disk space. How can I prevent vLLM from generating any log files through this command line, to save space? I really don't want to record any log files.