kubeagi / arcadia

A diverse, simple, and secure one-stop LLMOps platform
http://www.kubeagi.com/
Apache License 2.0
64 stars 21 forks

fix: hardcode resource request of GPUs to 1 if utilizing an existing ray … #942

Closed bjwswang closed 3 months ago

bjwswang commented 3 months ago

…cluster

What type of PR is this?

What this PR does / why we need it

Which issue(s) this PR fixes

Fixes #

Special notes for your reviewer

Verified on:

  1. Single node with 2 GPUs (with a resource limit of "nvidia.com/gpu = 2")

    image
  2. Two nodes, each with 1 GPU

    • RAY_CLUSTER_INDEX = 0
    • nvidia.com/gpu = 2 image

Pod logs: image

bjwswang commented 3 months ago

@nkwangleiGIT FYI