We should use psutil to list all processes for vllm, ray, and sglang and kill them directly, instead of trying to kill whatever is bound to the port. Killing by port can sometimes leave orphaned/zombie processes behind.
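A rough sketch of the psutil approach, for discussion only. The names `TARGETS`, `matches_target`, `find_inference_processes`, and `kill_inference_processes` are made up for illustration, and matching on plain substrings of the process name and command line is an assumption (note that "ray" as a substring can also match unrelated commands, so the real matcher probably needs to be stricter). The psutil calls themselves (`process_iter`, `children`, `terminate`, `kill`, `wait_procs`) are the library's standard API.

```python
import os

import psutil

# Assumption: these substrings identify the processes we care about.
TARGETS = ("vllm", "ray", "sglang")


def matches_target(name, cmdline, targets=TARGETS):
    """Return True if the process name or command line mentions a target."""
    haystack = " ".join([name or "", *(cmdline or [])]).lower()
    return any(target in haystack for target in targets)


def find_inference_processes(targets=TARGETS):
    """List matching processes, skipping the current process."""
    own_pid = os.getpid()
    found = []
    for proc in psutil.process_iter(["pid", "name", "cmdline"]):
        if proc.info["pid"] == own_pid:
            continue
        if matches_target(proc.info["name"], proc.info["cmdline"], targets):
            found.append(proc)
    return found


def kill_inference_processes(timeout=5.0):
    """Terminate matches and their children, then force-kill survivors."""
    victims = {}
    for proc in find_inference_processes():
        try:
            # Collect children too, so workers are not left orphaned.
            for child in proc.children(recursive=True):
                victims[child.pid] = child
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass
        victims[proc.pid] = proc

    for proc in victims.values():
        try:
            proc.terminate()  # polite SIGTERM first
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass

    # Wait for exit, then SIGKILL anything still alive after the timeout.
    _, alive = psutil.wait_procs(list(victims.values()), timeout=timeout)
    for proc in alive:
        try:
            proc.kill()
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass
    return sorted(victims)
```

Terminating first and only force-killing after a timeout gives the servers a chance to shut down their own workers cleanly.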
The health thread that restarts the model should write the model's life cycle state to a file, so other threads can give details to the requesting service. We may want to return a 429 in this case to make the requestor wait.
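One possible shape for the state file, as a sketch only. The file location, the state names ("restarting", "ready", "unknown"), the JSON fields, and the function names `write_state`, `read_state`, and `response_for_request` are all assumptions for illustration. The write goes through a temp file plus `os.replace` so readers never see a half-written file.

```python
import json
import os
import tempfile
import time
from pathlib import Path

# Hypothetical location for the life cycle file.
STATE_FILE = Path(tempfile.gettempdir()) / "model_lifecycle.json"


def write_state(state, detail="", path=STATE_FILE):
    """Called by the health thread. Atomically replaces the state file."""
    payload = {"state": state, "detail": detail, "updated_at": time.time()}
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(payload, handle)
    os.replace(tmp_name, path)  # atomic rename on the same filesystem


def read_state(path=STATE_FILE):
    """Called by request threads. Falls back to 'unknown' if unreadable."""
    try:
        return json.loads(path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return {"state": "unknown", "detail": "", "updated_at": None}


def response_for_request(path=STATE_FILE, retry_after_s=10):
    """Return (status, headers, info) for a request given the model state."""
    info = read_state(path)
    if info["state"] == "ready":
        return 200, {}, info
    # Model is not ready: ask the requestor to wait and retry.
    return 429, {"Retry-After": str(retry_after_s)}, info
```

The health thread would call something like `write_state("restarting", detail="health check failed")` before killing the model and `write_state("ready")` once it is back up. The `Retry-After` header tells the requestor how long to wait before trying again.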