coiled / feedback

A place to provide Coiled feedback

Docker image rebuilds first time launching a cluster with a new software environment #143

Closed gjoseph92 closed 3 years ago

gjoseph92 commented 3 years ago

When I create a new software environment, the first time I actually launch a cluster with it, Coiled often rebuilds the Docker image. This is confusing, because during create_software_environment, it certainly seemed like it was building the image.

This just happened to me, and I've noticed it a few times before, but I can't reliably reproduce it right now. It goes something like this:

import coiled

coiled.create_software_environment(
    name="test-rebuild",  # a name you've never used before
    pip=["dask[complete]", "xarray", "requests"],
)
# Updating software environment...
# Solving conda environment...
# Conda environment solved!
# Building Docker image
# (this takes a few minutes)
# ...and so on

cluster = coiled.Cluster(
    software="test-rebuild"
)
# Checking environment images
# Software environment not found, rebuilding.
# Building Docker image
# (this takes a few minutes)
# ...and so on, basically exact same build logs as above

The second build at cluster creation time is the annoying part. (This is with ECS + ECR, FWIW.)

necaris commented 3 years ago

We hope this is already fixed on staging, and further improvements are coming soon!

gjoseph92 commented 3 years ago

Awesome, thanks! Is this actually a region thing? My account's default region is us-west-1, but I just changed an existing environment, then launched a cluster in eu-central-1, and the cluster required a rebuild. I'm realizing that most of the times I've seen this were probably when I was setting backend_options={"region": "..."}.

necaris commented 3 years ago

Yes -- if you change regions, it'll rebuild and re-cache the environment (ECR has region-specific endpoints).

gjoseph92 commented 3 years ago

This was in fact a region issue. I was launching the cluster in a different region than my account's default region.
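
For anyone who lands here later, a minimal sketch of why the region matters. Private ECR registries are addressed per region, so an image pushed to the registry in one region is not visible at the registry hostname of another. The account ID and the repository name below are hypothetical placeholders, and this does not claim to show how Coiled actually names or looks up its images; it only illustrates the hostname difference.

```python
# Hypothetical 12-digit AWS account ID, for illustration only.
ACCOUNT_ID = "123456789012"


def ecr_registry_host(account_id: str, region: str) -> str:
    """Return the hostname of an account's private ECR registry in a region."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def image_uri(account_id: str, region: str, repository: str, tag: str = "latest") -> str:
    """Return the full URI an image would have in that region's registry."""
    return f"{ecr_registry_host(account_id, region)}/{repository}:{tag}"


# "test-rebuild" as a repository name is an assumption made for this sketch.
default_uri = image_uri(ACCOUNT_ID, "us-west-1", "test-rebuild")
other_uri = image_uri(ACCOUNT_ID, "eu-central-1", "test-rebuild")

print(default_uri)
print(other_uri)
# Same account, same repository, same tag, but two different registries.
print(default_uri == other_uri)
```

An image built and pushed for us-west-1 therefore has to be built (or copied) again before a cluster in eu-central-1 can pull it from its own regional registry, which matches the rebuild seen when passing backend_options={"region": "..."}.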