Closed gaocegege closed 1 year ago
INFO: 10.4.17.1:56470 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:56486 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:56488 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:56508 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:56518 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:56520 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:56522 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:56536 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:56534 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:53518 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:53528 - "GET / HTTP/1.1" 200 OK
2023-06-06 10:13:46,824 - 1 - WARNING - logging.py:295 - The dtype of attention mask (torch.int64) is not bool
INFO: 10.4.2.24:42932 - "POST /chat/completions HTTP/1.1" 200 OK
INFO: Shutting down
INFO: 10.4.17.1:53544 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:53556 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:53562 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:53568 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:53574 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:53580 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:53594 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:53606 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:53608 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:53624 - "GET / HTTP/1.1" 200 OK
INFO: 10.4.17.1:53630 - "GET / HTTP/1.1" 200 OK
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
INFO: Finished server process [1]
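The log above is the server output leading up to the shutdown.
This is caused by uvicorn. The inference call blocks the event loop, so the ping endpoint cannot respond while a completion is being generated.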
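To illustrate the blocking behaviour described here, the following is a minimal sketch using only the standard library (Python 3.9+). It does not use the project's actual server code: `blocking_inference`, `ping_latencies`, and `measure` are hypothetical names, and `time.sleep` stands in for the model call. It compares how late a periodic "ping" coroutine wakes up when the slow call runs directly on the event loop and when it is offloaded with `asyncio.to_thread`.

```python
import asyncio
import time


def blocking_inference(seconds: float) -> str:
    """Hypothetical stand-in for a synchronous model call."""
    time.sleep(seconds)
    return "done"


async def ping_latencies(stop: asyncio.Event, interval: float = 0.01) -> list[float]:
    """Wake up every `interval` seconds and record how late each wake-up was."""
    delays = []
    while not stop.is_set():
        start = time.perf_counter()
        await asyncio.sleep(interval)
        delays.append(time.perf_counter() - start - interval)
    return delays


async def measure(offload: bool, work: float = 0.3) -> float:
    """Return the worst ping delay observed while one 'inference' runs."""
    stop = asyncio.Event()
    pinger = asyncio.create_task(ping_latencies(stop))
    await asyncio.sleep(0.05)  # let the pinger run a few iterations first
    if offload:
        # Runs in a worker thread; the event loop keeps serving the pinger.
        await asyncio.to_thread(blocking_inference, work)
    else:
        # Runs on the event loop thread; nothing else can be scheduled meanwhile.
        blocking_inference(work)
    stop.set()
    delays = await pinger
    return max(delays)


if __name__ == "__main__":
    print("worst ping delay, blocking call:", asyncio.run(measure(offload=False)))
    print("worst ping delay, offloaded call:", asyncio.run(measure(offload=True)))
```

In the blocking case the worst delay is roughly the length of the slow call; in the offloaded case it stays small. Whether offloading is enough for a real model depends on how much of the inference releases the GIL, which this sketch does not show.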
us-central1-docker.pkg.dev/nth-guide-378813/modelzai/llm-chatglm-6b:23.06.9
The server exits with code 137 after handling a request, even though both GPU memory and host memory usage are low (12 GB of 24 GB GPU memory, 3 GB of 32 GB RAM).
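For context on the exit code: by the common shell convention, an exit status above 128 means the process was terminated by signal number `status - 128`. The sketch below decodes that with the standard `signal` module; `signal_from_exit_code` is a hypothetical helper name, and the signal numbers assume Linux.

```python
import signal
from typing import Optional


def signal_from_exit_code(code: int) -> Optional[str]:
    """Return the signal name for a 128 + N exit status, or None if it does not apply."""
    if code <= 128:
        return None
    try:
        return signal.Signals(code - 128).name
    except ValueError:
        # No signal with that number on this platform.
        return None


if __name__ == "__main__":
    print(signal_from_exit_code(137))
```

A status of 137 therefore corresponds to signal 9, SIGKILL. The code alone does not say who sent it: it can come from an out-of-memory kill, or from an orchestrator killing the container, for example after failed health checks.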