coiled / feedback

A place to provide Coiled feedback

How are cores counted? "Requested core count is beyond user limits" #124

Closed rubenvdg closed 3 years ago

rubenvdg commented 3 years ago

If I make a cluster by using:

import coiled
cluster = coiled.Cluster(
    name="clustertje",
    n_workers=80,
    worker_cpu=2,
    worker_memory="16 GiB"
)

I get the following exception: Exception: worker_count_requested={'message': 'Requested core count is beyond user limits', 'request': '320', 'active': '2', 'limit': '200'}.

I would expect that my requested core count is 160 (<200). How is it 320?

FabioRosado commented 3 years ago

Hello Ruben, thank you for the question. We currently count cores as the sum of the CPUs of all running and pending schedulers, plus the worker count times the CPUs requested per worker. We also include the CPUs that jobs and notebooks use.
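As a rough illustration of the rule described above, here is a minimal sketch of the arithmetic. The function name and parameters are hypothetical and are not part of the Coiled API; the sketch only encodes the counting rule as stated in this comment.

```python
def expected_core_count(n_workers, worker_cpu, scheduler_cpus=(), other_cpus=0):
    """Hypothetical helper, not a Coiled function.

    scheduler_cpus: CPUs of each running or pending scheduler (assumed input).
    other_cpus: CPUs taken by jobs and notebooks (assumed input).
    """
    worker_cores = n_workers * worker_cpu  # worker count times CPUs per worker
    return sum(scheduler_cpus) + worker_cores + other_cpus


# The cluster from the question: 80 workers with 2 CPUs each.
workers_only = expected_core_count(n_workers=80, worker_cpu=2)
print(workers_only)  # 160

# Same request with one scheduler assumed to use 2 CPUs.
with_scheduler = expected_core_count(n_workers=80, worker_cpu=2, scheduler_cpus=[2])
print(with_scheduler)  # 162
```

Under this rule the request in the question comes to 160 worker cores plus whatever the schedulers, jobs, and notebooks use, which is why a reported value of 320 looks unexpected.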

Soon we will add a card to the dashboard that shows how many cores you have used so far, and we are also adding a command to display core usage.

That said, there might be an issue here. Please bear with me while I check something 👍

mrocklin commented 3 years ago

Given the response in the exception, my guess is that you have another cluster running. You should be able to view all running clusters and stop idle ones on the dashboard at cloud.coiled.io. Idle clusters should clean themselves up after 20 minutes of inactivity.

rubenvdg commented 3 years ago

Thanks. I did stop all clusters in the console beforehand. But in hindsight I was probably too impatient (and they were still being torn down).

FabioRosado commented 3 years ago

Hello Ruben, I just wanted to give you an update about this. The cores are now counted as you would expect. We have also added a core usage table to the UI and a list_core_usage() command that shows a table of your current usage.

I am closing this issue, but please feel free to reopen it or create a new one if you run into any problems.