Open cshyjak opened 4 months ago
Testing out RayLLM and running into an issue where the model loads and runs fine initially but starts throwing errors after roughly an hour of running. This happens with multiple types of models. The example below uses the model config from this repo.
RayService Configuration:
```yaml
apiVersion: ray.io/v1alpha1
kind: RayService
metadata:
  name: laivly-ml
  namespace: sidd-platform
spec:
  serviceUnhealthySecondThreshold: 1200 # Config for the health check threshold for service. Default value is 60.
  deploymentUnhealthySecondThreshold: 1200 # Config for the health check threshold for deployments. Default value is 60.
  serveConfigV2: |
    applications:
      - name: router
        import_path: rayllm.backend:router_application
        route_prefix: /llm
        args:
          models:
            - ./models/continuous_batching/quantization/TheBloke--Llama-2-7B-chat-AWQ.yaml
  rayClusterConfig:
    headGroupSpec:
      rayStartParams:
        resources: '"{\"accelerator_type_cpu\": 2}"'
        dashboard-host: '0.0.0.0'
      template:
        spec:
          containers:
            - name: ray-head
              image: anyscale/ray-llm:0.5.0
              resources:
                limits:
                  cpu: 2
                  memory: 8Gi
                requests:
                  cpu: 2
                  memory: 4Gi
              ports:
                - containerPort: 6379
                  name: gcs-server
                - containerPort: 8265 # Ray dashboard
                  name: dashboard
                - containerPort: 10001
                  name: client
                - containerPort: 8000
                  name: serve
          nodeSelector:
            kubernetes.io/arch: amd64
    workerGroupSpecs:
      - replicas: 1
        minReplicas: 0
        maxReplicas: 4
        groupName: a10-gpu
        rayStartParams:
          resources: '"{\"accelerator_type_cpu\": 46, \"accelerator_type_a10\": 4}"'
        template:
          spec:
            containers:
              - name: llm
                image: anyscale/ray-llm:0.5.0
                lifecycle:
                  preStop:
                    exec:
                      command: ["/bin/sh", "-c", "ray stop"]
                resources:
                  limits:
                    cpu: "46"
                    memory: "190G"
                    nvidia.com/gpu: 4
                  requests:
                    cpu: "2"
                    memory: "4G"
                    nvidia.com/gpu: 4
                ports:
                  - containerPort: 8000
                    name: serve
            nodeSelector:
              karpenter.k8s.aws/instance-family: g5
```
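For context on when the errors start appearing, this is a minimal sketch of how the deployed router is queried. It assumes the Serve `route_prefix: /llm` from the config above plus an OpenAI-style chat-completions path, and the host, port, and model id are placeholders, not confirmed values from this deployment:

```python
import json
import urllib.request


def build_request(host="http://localhost:8000"):
    """Build the HTTP request used to probe the model endpoint.

    The path combines the route_prefix from serveConfigV2 with an
    OpenAI-compatible completions path (assumed, not verified here).
    """
    url = f"{host}/llm/v1/chat/completions"
    payload = {
        # Model id is a guess derived from the YAML filename above.
        "model": "TheBloke/Llama-2-7B-chat-AWQ",
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": 16,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request()
print(req.full_url)  # endpoint that starts failing after ~1 hr
```

Requests like this succeed for about the first hour after deployment and then begin returning errors.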