coiled / feedback

A place to provide Coiled feedback

Failing software environment does not err #147

Closed mrocklin closed 1 year ago

mrocklin commented 3 years ago

When I run the following:

import coiled

coiled.create_software_environment(
    name="mrocklin/pangeo",
    conda={
        "channels": ["conda-forge", "defaults"],
        "dependencies": ["xarray", "dask", "intake", "zarr", "s3fs"],
    },
)

I get the following output

Updating software environment...
Creating new software environment
Creating new ecr build

STEP 1: FROM coiled/default:sha-6b4e896
STEP 2: COPY environment.yml environment.yml
--> 2878bad4870
STEP 3: RUN conda env update -n coiled -f environment.yml     && rm environment.yml     && conda clean --all -y     && echo "conda activate coiled" >> ~/.bashrc
Collecting package metadata (repodata.json): ...working... done

And then I get a prompt again

In [2]: 

But the environment has not successfully built. I suspect that there

mrocklin commented 3 years ago

FWIW I'm mostly unable to use conda-forge effectively

FabioRosado commented 3 years ago

Thank you for raising this issue, @mrocklin. I have a suspicion as to why this might be happening; let me try running it locally to confirm.

NJ-Greg commented 3 years ago

I did not set out to reproduce this, but inadvertently I have found the same problem.

Code
```
import coiled
from rich import print
from dask.distributed import Client
import dask.dataframe as dd
from time import sleep
import datetime


def time_stamp(message):
    now = datetime.datetime.now()
    format = "%Y-%m-%d %H:%M:%S"
    dt = now.strftime(format)
    print(f"{message} at: [bold black]{dt}[/bold black]")


TEST_SENV = "greg-smith/senv-create-test"
TEST_CLUSTER = "dashboard-example-cluster"

SENV = {"channels":["default","conda-forge"],
        "dependencies":["dask==2021.06.0", "blosc", "lz4", "numpy", "matplotlib",
                        "s3fs", "xarray", "numba", "ipywidgets"]}

time_stamp("[bold green]Checking for {TEST_SENV}[/bold green]")
env_list = coiled.list_software_environments()
if TEST_SENV in env_list:
    time_stamp(f"[bold green]Found {TEST_SENV}[/bold green]")
else:
    time_stamp(f"[bold green]Creating {TEST_SENV}[/bold green]")
    try:
        coiled.create_software_environment(name=TEST_SENV, conda=SENV)
    except Exception:
        time_stamp(f"[bold red]Failed to create {TEST_SENV}[/bold red]")

Checking for {TEST_SENV} at: 2021-06-05 21:29:20
Creating greg-smith/senv-create-test at: 2021-06-05 21:29:21
Updating software environment...
Creating new ECR repository
Creating new software environment
Creating new ecr build
STEP 1: FROM coiled/default:sha-6b4e896
STEP 2: COPY environment.yml environment.yml
--> 5e561be1076
STEP 3: RUN conda env update -n coiled -f environment.yml && rm environment.yml && conda clean --all -y && echo "conda activate coiled" >> ~/.bashrc
Collecting package metadata (repodata.json): ...working... done
```

If I change the value of SENV, then it works:

New SENV
```
SENV = {"channels":["default","conda-forge"],
        "dependencies":["dask==2021.06.0", "blosc", "lz4", "numpy", "s3fs", "ipywidgets"]}
```

The difference here is the presence of "matplotlib", "xarray", and "numba".
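
For anyone wanting to double-check which packages differ between the failing and the working specification, a set difference over the two dependency lists quoted above does it. This is only a sketch; the names `failing_deps` and `working_deps` are illustrative and not part of the coiled API.

```python
# Dependency lists copied from the two SENV definitions above.
failing_deps = ["dask==2021.06.0", "blosc", "lz4", "numpy", "matplotlib",
                "s3fs", "xarray", "numba", "ipywidgets"]
working_deps = ["dask==2021.06.0", "blosc", "lz4", "numpy", "s3fs", "ipywidgets"]

# Packages that appear only in the failing specification.
only_in_failing = sorted(set(failing_deps) - set(working_deps))
print(only_in_failing)  # ['matplotlib', 'numba', 'xarray']
```

This only narrows down the candidates; it does not show which of the three packages (or which combination of them) triggers the hang.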

[Edit - now I notice that Kris has seen this as well, and Dan seems to have a good diagnosis, in #2841]

dantheman39 commented 3 years ago

Yes, there is an open PR to properly display the error when the websocket connection is unexpectedly broken, and work on the underlying infrastructure problem that causes the websocket hangup will start Monday.

https://github.com/coiled/cloud/pull/2843
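
To illustrate the general idea of surfacing a dropped connection instead of returning silently, here is a minimal sketch of a log consumer that raises when the stream ends without a terminal status. It is not the code from the PR: the message shape (dicts with `type` and `text` keys), the `"done"` and `"error"` markers, and the names `BuildStreamError` and `consume_build_logs` are all assumptions made up for this example.

```python
from typing import Dict, Iterable, List


class BuildStreamError(RuntimeError):
    """Hypothetical error for a build log stream that fails or ends early."""


def consume_build_logs(messages: Iterable[Dict[str, str]]) -> List[str]:
    """Collect log lines and insist on an explicit terminal message.

    Assumed (hypothetical) message shape:
    {"type": "log" | "done" | "error", "text": "..."}
    """
    lines: List[str] = []
    for message in messages:
        kind = message.get("type")
        if kind == "log":
            lines.append(message.get("text", ""))
        elif kind == "done":
            return lines  # the build reported success explicitly
        elif kind == "error":
            raise BuildStreamError(message.get("text", "build failed"))
    # The stream was exhausted without "done" or "error": the connection was
    # most likely dropped, so surface that instead of returning quietly.
    raise BuildStreamError("log stream closed before the build reported a result")


# Usage: a stream that stops after the last line seen in this issue.
truncated = [
    {"type": "log",
     "text": "Collecting package metadata (repodata.json): ...working... done"},
]
try:
    consume_build_logs(truncated)
except BuildStreamError as exc:
    print(f"Build did not finish: {exc}")
```

The point of the design is that an exhausted stream is treated as a failure unless a success message was seen, so a hangup cannot be mistaken for a completed build.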

NJ-Greg commented 3 years ago

@FabioRosado requested I repeat my prior test. As of 2021-06-30, using coiled==0.0.43 on production, the following code ran in 13 minutes 18 seconds without throwing any errors:

import coiled

TEST_SENV = "senv-create-test"
SENV = {
    "channels": ["default", "conda-forge"],
    "dependencies": [
        "dask==2021.06.0",
        "blosc", "lz4",
        "numpy",
        "matplotlib",
        "s3fs",
        "xarray",
        "numba",
        "ipywidgets",
    ],
}
coiled.create_software_environment(name=TEST_SENV, conda=SENV)

shughes-uk commented 1 year ago

Fully resolved by software environment rewrite