ocean-transport / coiled_collaboration

Repository to track and link issues across other repositories that are relevant to the Abernathey Lab and Coiled

Running AMOC heat transport workflow on coiled #13

Closed: cspencerjones closed this issue 3 years ago

cspencerjones commented 3 years ago

@rabernat suggested that we might try running my AMOC heat transport workflow on coiled, since the original data are on AWS.

The main thing that I would be worried about is that I currently write some intermediate data to the Pangeo scratch bucket on AWS. Do we know if this would be possible with coiled?
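For reference, the intermediate-write step currently looks roughly like the sketch below (a minimal sketch only: the PANGEO_SCRATCH variable is how Pangeo Cloud exposes the scratch bucket, and the dataset and path names are just placeholders):

import os
import fsspec
import xarray as xr

# Pangeo Cloud exposes the scratch bucket via PANGEO_SCRATCH,
# e.g. "s3://pangeo-scratch/<username>"
scratch = os.environ["PANGEO_SCRATCH"]
store = fsspec.get_mapper(f"{scratch}/amoc-heat-transport/intermediate.zarr")

# placeholder dataset standing in for the real intermediate fields
ds = xr.Dataset({"vT": ("time", [0.0, 1.0])})
ds.to_zarr(store, mode="w")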

rabernat commented 3 years ago

> Do we know if this would be possible with coiled?

Hmm good question. The credentials are coming from the kubernetes environment, so we don't have access to the actual key / secret. We would need input from @jrbourbeau et al about whether we can propagate the credentials to the coiled environment.

jrbourbeau commented 3 years ago

Coiled will forward local AWS credentials to the remote cluster running on AWS. The idea is that the workers in your cluster should be able to access anything you can access locally (where your Client is running). Is there a way for you to get access to AWS credentials on your local machine?

If this isn't straightforward with your current setup, you could always run Coiled from within Pangeo Cloud. That way, the credentials you already have set up in the Kubernetes environment will be forwarded to your Coiled cluster.
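For example, once AWS credentials are available locally, a quick check that the workers really do inherit them might look like this (a rough sketch; the bucket name is illustrative):

import coiled
import s3fs
from dask.distributed import Client

cluster = coiled.Cluster(n_workers=2)
client = Client(cluster)

def list_scratch():
    # s3fs picks up the credentials Coiled forwards to the workers
    fs = s3fs.S3FileSystem()
    return fs.ls("pangeo-scratch")

# runs on a worker, so this only succeeds if the credentials were forwarded
print(client.submit(list_scratch).result())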

jbusecke commented 3 years ago

I just successfully wrote to the AWS scratch bucket by using Pangeo Cloud as the client (with a custom environment) and Coiled as the cluster!

You can find the environment and notebook I used here.

The only additional step I had to take was to install the environment locally with mamba env create -f environment.yml (which I guess you would have to do each time you start a Pangeo Cloud server) and then run the notebook with that kernel.
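Roughly, the pattern is to point both the local kernel and the Coiled software environment at the same environment.yml, something like this (a sketch, not the exact notebook code; the environment name is arbitrary):

import coiled
from dask.distributed import Client

# register a Coiled software environment from the same environment.yml
# that was installed locally with mamba (name is arbitrary)
coiled.create_software_environment(
    name="pangeo-scratch-test",
    conda="environment.yml",
)

cluster = coiled.Cluster(software="pangeo-scratch-test")
client = Client(cluster)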

rabernat commented 3 years ago

Is there a way to bypass the environment creation step and just point to an existing docker image? All pangeo cloud environments use images from https://github.com/pangeo-data/pangeo-docker-images.

jbusecke commented 3 years ago

I think this is what you are looking for?

rabernat commented 3 years ago

Yes, perfect. So just cross-reference with pangeo-cloud-federation to see which Docker image to use.

For the staging AWS cluster it is: https://github.com/pangeo-data/pangeo-cloud-federation/blob/staging/deployments/icesat2/image/binder/Dockerfile (currently pangeo/pangeo-notebook:2021.05.04)

For the production cluster it is: https://github.com/pangeo-data/pangeo-cloud-federation/blob/prod/deployments/icesat2/image/binder/Dockerfile (currently pangeo/pangeo-notebook:2021.02.02)

jrbourbeau commented 3 years ago

> I just successfully wrote to the AWS scratch bucket by using Pangeo Cloud as the client (with a custom environment) and Coiled as the cluster!

Woo, that's super cool to see!

Yeah, since Pangeo already handles building / deploying Docker images, we can just have Coiled use those images and everything should "just work". I see @jbusecke beat me to the punch on that : )

@cspencerjones I think doing something similar to @jbusecke's notebook, but with

import coiled

coiled.create_software_environment(
    name='pangeo-cloud-staging',
    container='pangeo/pangeo-notebook:2021.05.04',    # matches Pangeo Cloud AWS staging cluster
    # container='pangeo/pangeo-notebook:2021.02.02',  # matches Pangeo Cloud AWS production cluster
)

should work for you to run your AMOC heat transport workflow on Coiled. Let me know if you have any questions or run into any bumps along the way.
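Once that software environment is registered, spinning up the cluster itself would look roughly like this (worker count is arbitrary):

import coiled
from dask.distributed import Client

cluster = coiled.Cluster(
    software="pangeo-cloud-staging",  # the software environment created above
    n_workers=10,
)
client = Client(cluster)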

cspencerjones commented 3 years ago

This seems to work great in the staging environment. The version of dask is too old in the production environment right now. But I think that's fine as a temporary solution, and presumably the production environment will be updated soon.

Thanks for your help.

cspencerjones commented 3 years ago

OK, I am closing this. But related issues are coming.

jrbourbeau commented 3 years ago

Glad things are up and running. Looking forward to the coming issues : )