Open jfabdo opened 4 years ago
It looks like Docker can be used with Jupyter notebooks, although I'll be the first to admit that I have never used one.
I was watching some of the B612 videos on YT, and Dr Lu mentioned that this was run in the cloud (which makes perfect sense). It looks like Google is one of your partners, so I assume that you are using GCP. Containerizing this with Docker would let you build it so that it runs the same way on every machine it runs on (if you're familiar with Docker, please forgive the explanation). Deployment could be automated and would be much quicker, with far less chance of errors.
If you'd like, I can fork this project and write up a Dockerfile for it. You can try it out and see if it works better for people running it.
We'd be happy to receive contributions here :)
Yes, we do run ADAM in GCP -- specifically, our backend infrastructure (API, processing engine), which does the actual computations. We've mainly just been using the Jupyter notebooks to script out how we want to send and receive data to/from the API.
Thank you for being open to suggestions :)
This would only affect deploying to and running in the cloud, not any computer hitting the API. I've used AWS more extensively than I have GCP, but in AWS, at least, it makes it much simpler to deploy if you already have a container; it removes the installation step (if that's how you're deploying to the cloud). I'll write it up and you can see if this works for your flow.
I wrote this up last night and this morning:
# Base image with conda preinstalled
FROM continuumio/miniconda3:latest
RUN mkdir /src
WORKDIR /src
# -y answers conda's confirmation prompt, since the build is non-interactive
RUN conda update -y conda
COPY . .
# Create the adam-dev environment, install the client in develop mode, and run the unit tests
RUN conda create -y -n adam-dev --file conda-requirements.txt \
&& conda init bash \
&& . /root/.bashrc \
&& conda activate adam-dev \
&& python setup.py develop \
&& pytest adam --cov=adam --ignore=adam/tests/integration_tests
# log in
# RUN adamctl login
# # log into the 'dev' environment as well
# RUN adamctl login dev https://adam-dev-193118.appspot.com/_ah/api/adam/v1
# # set up your workspaces
# RUN adamctl config envs.prod.workspace "uuid-received-from-@AstrogatorJohn"
# RUN adamctl config envs.dev.workspace "uuid-received-from-@AstrogatorJohn"
# Keep the container running so it can be attached to
CMD tail -f /dev/null
but it's failing the unit tests, so I wanted to be sure that it would work with your flow, especially since I don't have a login. There are ways to pass credentials in with Docker, though.
To run this, paste it into a file named Dockerfile in the project's home folder, run docker build --tag adam . to build the image,
and then run docker run adam
to start the container.
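On passing a login into the container: one common approach is to supply secrets at run time as environment variables (for example with docker run -e ADAM_TOKEN=... adam) rather than baking them into the image. Below is a minimal sketch of how a script inside the container might read such a value. The variable name ADAM_TOKEN and the helper load_token are hypothetical and are not part of the ADAM client.

```python
import os


def load_token(env=None, var_name="ADAM_TOKEN"):
    """Read a token from the environment, failing loudly if it is missing.

    `ADAM_TOKEN` is a hypothetical variable name used for illustration only.
    """
    # Default to the real process environment; a plain dict can be passed for testing.
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"{var_name} is not set; pass it with `docker run -e {var_name}=...`"
        )
    return token
```

Keeping the secret out of the Dockerfile means the same image can be shared publicly and each user supplies their own credentials when starting the container.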
I have a few questions before I can do a pull request: is adam_home just the client, or is it what is running in GCP as well? What does logging in do? Is this to coordinate the separate machines?
This looks good! Thanks 👍
adam_home (which we should probably, eventually, rename) contains both the Python client and our demo Jupyter notebooks, which have served as starting points for people's analysis.
The API is a frontend for our processing engine. In order to create jobs (e.g. to do computations) in ADAM, one needs to authenticate with ADAM. That's where logging in comes in. We currently don't have open registration, though we are working on that.
A little bit more about logging in and why adamctl exists... I vaguely recall that the login process from Jupyter to OAuth2 was confusing to users, so the adamctl library was created to initiate the OAuth2 flow and store credentials -- meant to be a one-time thing. The GCP CLI (gcloud) does something similar: before you can interact with gcloud and your GCP environment, you're required to authenticate with GCP, and gcloud stores the credentials, handles refreshing the access token, etc. I'm pretty sure AWS, Azure, etc. have a similar flow, though it's been a long time since I've used their CLI tools.
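The "log in once, store credentials, refresh when needed" pattern described above can be sketched roughly as follows. This is an illustration of the general idea only, not how adamctl or gcloud actually store credentials; the function names, the JSON layout, and the 60-second leeway are all assumptions made for the example.

```python
import json
import os
import time


def save_credentials(path, access_token, refresh_token, expires_in):
    """Store tokens on disk after a one-time login (hypothetical file layout)."""
    creds = {
        "access_token": access_token,
        "refresh_token": refresh_token,
        # Record an absolute expiry time so later runs can check it.
        "expires_at": time.time() + expires_in,
    }
    with open(path, "w") as f:
        json.dump(creds, f)
    # Restrict the file to the current user, since it holds secrets.
    os.chmod(path, 0o600)
    return creds


def load_credentials(path):
    """Return stored credentials, or None if the user has not logged in yet."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)


def needs_refresh(creds, now=None, leeway=60):
    """True if the access token has expired or expires within `leeway` seconds."""
    now = time.time() if now is None else now
    return creds["expires_at"] - now <= leeway
```

A client built this way would call load_credentials on start-up, run the interactive login only when it returns None, and exchange the refresh token for a new access token whenever needs_refresh is true.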
Regarding the unit tests, they've been broken for a while, unfortunately. We're working on fixing them, but there are other projects we've been focusing on, which we are close to wrapping up.
Good questions, and it tells me that it's important that we produce some more documentation :)
Great, no worries! I modified the Dockerfile accordingly, added instructions to the readme, and created a pull request. I tested it as much as I could, but unfortunately I do not have a UUID, so I cannot test it further. Let me know if any changes need to be made.
Thanks Carise!
Thanks! I saw the PR and will have some comments within the next day or so.
Thank you!
Hi Jacqueline, thanks for your question! The ADAM client was intended to be a Python package that is easily imported into, say, a Jupyter notebook. Curious to know if you have any suggestions for improvement, though?