Based on today's call with @jameshalgren, I suggest the following onboarding process. CIROH and AWI have ambitious plans so it's important we get the initial conditions right.
shared directory with some sample notebooks.

Suggested plan LGTM, @colliand. I added the request to the backlog board and we will find the eng resources so we can push forward on step 2 in a timely manner.
@jameshalgren, we will ping you soon with some questions about the specifics of the hub deployment.
Thanks @colliand, @damianavila. Processing, will respond soon.
A few questions, possibly specialized, probably going beyond the scope of this issue. Tagging @colliand to ask for redirection or moderation if necessary.
[x] Do the notebooks/lab functions in the hub support --collaborative use?
[x] Are there examples of building the hubs on top of open data cubes or other kinds of data cubes/lakes/meshes*?
[x] (More specifically...) In your circle of influence and acquaintance, do you know anyone working on addressing this tantalizing advertisement of future functionality:
Lots of evolution in this terminology; this is just one link from my modest attempt to survey the panoply.
Tagging @whitelightning450 @karnesh for situational awareness.
Hi James! Yes 2i2c has experience with the real-time-collaboration features in upstream Jupyter. Experiments have shown that feature is not ready for production deployments. There is ongoing work there and 2i2c will support RTC when we can do so securely and robustly.
Yes, our team is contributing to the "tantalizing future" you referenced. The pioneering work of the Pangeo community is an inspiration for the founding of 2i2c. We are in the process of on-boarding a new team member @jmunroe who has technical and community experience with big data geosciences. I spent some time briefing him on CIROH/AWI today and expect he will be an excellent resource for our collaboration.
@jameshalgren I haven't seen (which doesn't mean they don't exist, obviously) examples of hubs tightly integrated with ODCs. But from a quick look at the ODC setup, I see a key element of this is having an accessible Postgres server to manage the actual data catalogs and serving.
Coincidentally, as part of the Jupyter Meets the Earth effort, with @consideRatio and @yuvipanda we're looking right now at how to most cleanly set up a persistent, robust and cost-effective Postgres server that can be accessed by all the users of a Hub. We happen to need that for one of our research projects, and our current solution (via sqlite) is sub-optimal.
We'll be happy to share any progress we make on that front back with the rest of the team - just today I was discussing with @consideRatio how this was very likely to be a use case that many others would be likely to encounter. So I'm delighted to see that intuition confirmed by your needs, and it means it's all the more timely that we make progress on it :)
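As a concrete sketch of what hub-side access to such a shared Postgres server might look like, assuming connection details are injected into each user's environment (all variable names and the host below are hypothetical placeholders, not an existing 2i2c convention):

```python
import os

# Hypothetical sketch: assemble a libpq-style connection string for a
# shared Postgres server from environment variables injected into each
# user's pod. Variable names and defaults are illustrative only.
def hub_db_dsn() -> str:
    host = os.environ.get("HUB_DB_HOST", "db.hub.internal")
    port = os.environ.get("HUB_DB_PORT", "5432")
    user = os.environ.get("HUB_DB_USER", "jovyan")
    dbname = os.environ.get("HUB_DB_NAME", "catalog")
    return f"host={host} port={port} user={user} dbname={dbname}"

# A client such as psycopg2 could then connect with:
#   psycopg2.connect(hub_db_dsn())
```

Keeping credentials out of notebooks and in per-user environment variables is the main point of the sketch; the actual mechanism on a hub would likely be a Kubernetes secret mounted into the user pod.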
Assuming the National Water Model will be a key dataset used by this hub, I'll note a few other links
This is in addition to the NWM data store on GCP linked above.
I am interested in identifying other key datasets that the community will anticipate using on this hub, to ensure it is set up in a way that accessing that data is straightforward for users.
Thanks @jmunroe! I'll add @jameshalgren here in case he can share any other input on important data sets for the emerging CIROH community.
Thanks @colliand and @jmunroe. I've jotted down a few thoughts/responses to launch the weekend:
Assuming the National Water Model will be a key dataset used by this hub, I'll note a few other links
It will be the key dataset used in this hub, together with observation data initially from USGS, but from any valid source.
About the National Water Model from the Office of Water Prediction
- Includes links to HTTP and FTP sites of the last two days output of the NWM.
I think it is http only at this point. There are ftp-versions using the LDM protocol for direct sharing of data between NWS offices, but that's probably not relevant here for the moment. FWIW, the NOMADS servers also host all of the NWS weather model output -- though the storage formats are far from optimal for cloud access, just like the NWM data.
Amazon Web Services (AWS) Open Data Sponsorship Program Datasets:
There is a GCP bucket with version 1.2 of the same data (they use the label 'reanalysis', which is technically incorrect...). The AWS version of that data is more complete, with the 1.2, 2.0, and 2.1 versions of the retrospective data, along with experimental (?) versions with subsets of the data in zarr format.
The GCP bucket mentioned is a superset of the S3 resource, with the analysis, short (on s3), medium, and long-range output. In fact, only a handful of specific derived products appear to be missing from the GCP bucket relative to what is available on the direct download from NOMADS.
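To make the bucket layout concrete, here is a small helper that builds the object key for one short-range channel-output file. The path pattern is my reading of the public NOAA NWM bucket listing and should be treated as an assumption to verify before use:

```python
from datetime import datetime

# Hypothetical helper: build the object key for an hourly NWM short-range
# channel-output file, following the layout seen in the public NOAA NWM
# bucket (e.g. nwm.YYYYMMDD/short_range/...). Verify the pattern before
# relying on it for real data access.
def nwm_short_range_key(cycle: datetime, fhr: int) -> str:
    return (
        f"nwm.{cycle:%Y%m%d}/short_range/"
        f"nwm.t{cycle:%H}z.short_range.channel_rt.f{fhr:03d}.conus.nc"
    )
```

A tool like fsspec or xarray could then open the object directly from the bucket once the key is known.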
- Additional resources on the NWM
Hopefully, some of what we make here can allow for Dr. Maidment's work to be more easily contributed back to the broader NWM community. He and his team were critical influencers in the initiation of the project and continue to generate great work!
This is in addition to the NWM data store on GCP linked above.
I am interested in identifying other key datasets that the community will anticipate using on this hub, to ensure it is set up in a way that accessing that data is straightforward for users.
I mentioned USGS data. There is a useful toolset for accessing USGS data and we may use that or replicate a portion into storage on the cloud backend. I am aware of a similar script by @groutr.
That observed streamflow data (really observed stream-stage data converted to estimated flow -- but the convention is to call it streamflow...) will be the key initial dataset, because streamflow is the key output from the model. As we continue, additional variables will be examined and we will have to identify or create repositories of validation data to use for exploration.
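As an illustration of what such USGS access could look like against the public NWIS web service (the endpoint and query parameters follow the documented USGS instantaneous-values API, but treat the specifics as assumptions to verify):

```python
from urllib.parse import urlencode

# Sketch: build a query URL for the USGS NWIS instantaneous-values
# service, requesting streamflow (USGS parameter code 00060).
# Endpoint and parameter names follow the public NWIS API docs;
# verify against the current service before production use.
def nwis_streamflow_url(site: str, start: str, end: str) -> str:
    base = "https://waterservices.usgs.gov/nwis/iv/"
    query = urlencode({
        "format": "json",
        "sites": site,
        "startDT": start,
        "endDT": end,
        "parameterCd": "00060",  # discharge, cubic feet per second
    })
    return f"{base}?{query}"
```

The returned URL can be fetched with requests (or the dataretrieval toolset mentioned above can handle this end to end).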
A few questions for all of you 😉
Let's get started with a Pangeo-style Daskhub. The capacity of the team at AWI is increasing and a customized software environment will likely be ready later in the year.
OK, so starting with the pangeo-notebook image is enough to start with, I presume. Can you confirm?
I suggest this hub offer the VNC/Linux desktop feature.
IIRC, @yuvipanda set up this feature for the Jack Eddy symposium.
This hub should be hosted on GCP in a data center that hosts the National Water Model Data.
Are we talking about a dedicated cluster here? Or are you OK with the hub being deployed in a shared cluster? (@colliand do you have any more information about this aspect from the lead process? Thanks!)
I suggest this hub offer the VNC/Linux desktop feature.
IIRC, @yuvipanda set up this feature for the Jack Eddy symposium.
@yuvipanda it would be great if we could take this opportunity to document how to set up this feature in the hub features docs
For reference, I think this is solely something to set up in the user image. This is what JMTE has done to support this functionality.
It is then represented as the "Desktop" icon in the JupyterLab launcher.
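For illustration, a user-image fragment along those lines might look like the sketch below. The package set mirrors what remote-desktop-enabled images typically install (an xfce desktop plus the jupyter-remote-desktop-proxy server extension), but the exact names and versions are assumptions to check against the JMTE image:

```dockerfile
# Hypothetical fragment only -- verify package names against the JMTE image.
USER root
RUN apt-get update \
 && apt-get install -y --no-install-recommends dbus-x11 xfce4 xfce4-terminal \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*
USER ${NB_USER}
RUN pip install --no-cache-dir jupyter-remote-desktop-proxy
```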
Yes, I suggest that the AWI/CIROH hub be set up on a dedicated GKE cluster on the data center where the NWM data is hosted. I suggest that 2i2c manage the billing account for the cluster with the monthly cloud usage costs passed through to AWI. AWI/CIROH may choose to take over the billing account as the service and their devops capacity expands.
I like the advice shared by @consideRatio that we set up this hub to resemble the JMTE hub. The suite of integrated tools in that hub is tuned to support collaborations like those envisioned by CIROH.
I suggest that 2i2c manage the billing account for the cluster with the monthly cloud usage costs passed through to AWI. AWI/CIROH may choose to take over the billing account as the service and their devops capacity expands.
This sounds like we should create a new billing account and not just use the two-eye-two-see one, no?
P.S. It also looks like I don't manage the two-eye-two-see billing account, so I can't create a project attached to that one in the interim
I like the advice shared by @consideRatio that we set up this hub to resemble the JMTE hub. The suite of integrated tools in that hub is tuned to support collaborations like those envisioned by CIROH.
Link to that hub for reference?
@whitelightning450, @hellkite500, @aaraney, @karnesh, @mgdenno -- I have been meaning to loop you in so you can follow the development here.
@quebbs -- hello! -- tagging you ahead of upcoming discussion. This may be a tool to put to use.
Ok, I have created a new GCP account to deploy this into. I have connected the 2i2c billing account for now, and we can decide to change that later if needed.
(Big gold star ⭐ to Chris for figuring that out!)
on the data center where the NWM data is hosted
Can we be a bit more specific about this please? The NWM data is multi-regional in the US: so is us-central1-b ok? Do we envision this hub wanting to use GPUs in the future (then we should go with us-central1-c)?
Do we envision this hub wanting to use GPUs in the future (then we should go with us-central1-c)?
@colliand was that piece part of the conversation? @jameshalgren, any input about this one?
I think we want to avoid f-35 syndrome. Let me check with a couple of others, but I think we can do plenty with CPUs for now.
Having the option in the future might be useful. What are the trade-offs for going to the data center where GPUs are available?
Having the option in the future might be useful. What are the trade-offs for going to the data center where GPUs are available?
As far as I'm aware, none. We often put research hubs in that zone in case they want to upgrade to GPUs later, since moving the cluster after the fact would involve destroying it and redeploying.
I like the advice shared by @consideRatio that we set up this hub to resemble the JMTE hub. The suite of integrated tools in that hub is tuned to support collaborations like those envisioned by CIROH.
Link to that hub for reference?
@jmunroe here are links to the JMTE hub:
I like the advice shared by @consideRatio that we set up this hub to resemble the JMTE hub.
Note that I meant only to describe how to set up the ability to access a remote desktop interface, which is something that can be configured in the user environment as described here.
A way to set up the hub with this kind of functionality would be to bootstrap the hub with an image that includes those pieces.
Hi @jameshalgren. This deck created by @fperez describes some of the features of the Jupyter Meets the Earth (JMTE) hub. This is an opinionated curated integrated "batteries included" deployment that goes beyond the (already awesome) JupyterLab. After the CIROH/AWI hub is launched, I look forward to working with you to organize a kickoff event for you and your community champions in which Fernando gives a demonstration.
In an exchange elsewhere, I learned from @consideRatio that the JMTE hub has the following extra features:
Below are some non-2i2c-default features used within the JMTE hub.
The syncthing conda package and the jupyter-syncthing-proxy pip package should be installed, see:
@jameshalgren Can you please provide the list of GitHub Teams you would like to have access to the hub?
I am struggling to install TurboVNC with the provided code snippet and receiving the following error:
E: Invalid archive signature
E: Internal error, could not locate member control.tar{.zst,.lz4,.gz,.xz,.bz2,.lzma,}
E: Could not read meta data from /home/jovyan/turbovnc.deb
E: The package lists or status file could not be parsed or opened.
@sgibson91 seems like you have the exact same code snippet and a similar base image as in https://github.com/pangeo-data/jupyter-earth/blob/master/hub.jupytearth.org-image/Dockerfile. So, maybe the apt install step crashes because of something missing, such as build-essential?
Hmmm, googling the errors, I see notes about apt clean etc. Also, I note that you have an earlier step using apt update that didn't end with a cleanup step. Maybe that could help? This is a wild guess without motivation.
Thanks @consideRatio. I added the clean-up step to the earlier apt update invocation, and that produced a new error related to "held broken packages". So I added an apt update and the clean-up step to the TurboVNC step, and now it builds successfully 🤷🏻
Final commit looks like this: 2i2c-org/awi-ciroh-image@6d4f05c
(#1)
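For anyone hitting the same "Invalid archive signature" / "held broken packages" errors, the fix described above can be sketched as the Dockerfile fragment below. This is an illustrative sketch, not the exact committed code; see the linked commit for that.

```dockerfile
# Sketch of the apt hygiene pattern discussed above: refresh package
# lists, install, then clean up -- all inside a single RUN step, so a
# stale or truncated package list never leaks into a later layer.
RUN apt-get update \
 && apt-get install -y ./turbovnc.deb \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*
```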
@jameshalgren Can you please provide the list of GitHub Teams you would like to have access to the hub?
@sgibson91 -- alabamawaterinstitute, please, and NOAA-OWP. Thanks!
alabamawaterinstitute, please, and NOAA-OWP
These are organizations, I was under the impression you wanted specific teams to have access? E.g. the tech-team that is a member of the 2i2c org -> https://github.com/orgs/2i2c-org/teams/tech-team
Ah pardon me, I think I'm misremembering another hub setup issue where a question was raised about subteams
Ah pardon me, I think I'm misremembering another hub setup issue where a question was raised about subteams @sgibson91 10-4 -- we may refine later, but I'm assuming that is a simple process.
I'm assuming that is a simple process
Absolutely.
The hubs are available here:
Please note these docs about authorising the GitHub app for the first time: https://infrastructure.2i2c.org/en/latest/howto/configure/auth-management.html#follow-up-github-organization-administrators-must-grant-access
@consideRatio are there any other setup steps regarding the VNC/Linux desktop? I would've expected a button on the Lab Launcher saying "Desktop", but it's not there. Also, changing /lab to /desktop in the URL returns a 404 😕 Maybe @yuvipanda can help too?
Image repo: https://github.com/2i2c-org/awi-ciroh-image
This is what is done in the JMTE image, which is based on a pangeo-notebook base image: https://github.com/2i2c-org/infrastructure/issues/1444#issuecomment-1187405324. I don't think anything else is needed!
🤔 Hmmm ok, maybe Yuvi can help me debug when he's online then
@sgibson91 I would suspect https://github.com/2i2c-org/awi-ciroh-image/commit/7b080bef9a29e7d791e62058229f1946812f403a#diff-dd2c0eb6ea5cfc6c4bd4eac30934e2d5746747af48fef6da689e85b752f39557R32-R33 could be to blame. I don't understand how jupyter-server-proxy registers things to show up in jupyterlab and start up properly, but jupyterlab presents icons for notebook / kernels etc, and maybe there is a common mechanism in play related to removing nb_conda_kernels.
Hmmm, thinking about it, if you don't succeed in accessing /user/some-name/desktop, it makes me think that jupyter-server-proxy has failed to start. That I know from experience can happen if some other jupyter-server-proxy package fails to load properly. So, something else registering itself with jupyter-server-proxy may be to blame.
Yeah, tbh, I'm just guessing and used https://github.com/2i2c-org/coessing-image/blob/main/Dockerfile as a starting point (before the Julia addition :D)
The hubs are available here:
Awesome! Does this mean we can get in a start trying things out (I assume this will begin to incur cloud costs...)?
@jameshalgren yes and yes :) I'm still trying to figure out the VNC/Linux desktop feature though
I made some progress in PR https://github.com/2i2c-org/awi-ciroh-image/pull/3 I now have the Desktop icon on JupyterLab's launcher (I'm testing this on the staging hub).
However when I click on it, I see "Something went wrong, connection is closed"
Logs from my user server (k logs jupyter-sgibson91) show:
[I 2022-07-25 16:05:12.314 SingleUserNotebookApp handlers:432] Trying to establish websocket connection to ws://localhost:5901/websockify
2022-07-25 16:05:12,316 - SingleUserNotebookApp - ERROR - Uncaught exception GET /user/sgibson91/desktop/websockify (10.128.0.3)
HTTPServerRequest(protocol='https', host='staging.ciroh.awi.2i2c.cloud', method='GET', uri='/user/sgibson91/desktop/websockify', version='HTTP/1.1', remote_ip='10.128.0.3')
Traceback (most recent call last):
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/tornado/tcpclient.py", line 138, in on_connect_done
stream = future.result()
tornado.iostream.StreamClosedError: Stream is closed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/tornado/websocket.py", line 956, in _accept_connection
await open_result
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/jupyter_server_proxy/handlers.py", line 672, in open
return await super().open(self.port, path)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/jupyter_server_proxy/handlers.py", line 494, in open
return await self.proxy_open('localhost', port, proxied_path)
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/jupyter_server_proxy/handlers.py", line 444, in proxy_open
await start_websocket_connection()
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/jupyter_server_proxy/handlers.py", line 435, in start_websocket_connection
self.ws = await pingable_ws_connect(request=request,
File "/srv/conda/envs/notebook/lib/python3.9/asyncio/tasks.py", line 328, in __wakeup
future.result()
File "/srv/conda/envs/notebook/lib/python3.9/site-packages/tornado/iostream.py", line 1205, in connect
self.socket.connect(address)
OSError: [Errno 99] Cannot assign requested address
@GeorgianaElena suggested some missing packages in https://github.com/2i2c-org/awi-ciroh-image/pull/3#pullrequestreview-1050597105 and now the desktop feature is available!
Thanks @jameshalgren. I fixed the link to point to the intended slide deck created by Fernando.
Now that the production and staging hubs are available, I suggest to @jameshalgren that we organize a kickoff event for CIROH personnel who will manage the hub with @jmunroe @fperez (and perhaps others on the 2i2c team). Perhaps we can link up for a phone call to discuss some launch planning?
@colliand -- targeting 23 August for a technically focused demo.
I think we can close this issue (new hub set up) now, since I believe it is completed, and continue the conversation in new issues.
Thanks @damianavila -- new issues (as needed) are still posted under this repository, correct? (and, for that matter, thanks @colliand, @sgibson91, @consideRatio, @fperez, and @jmunroe and all the rest -- we're excited!)
@jameshalgren, for follow-up questions/requests I would suggest using our support email channel. Over there, we will be able to provide useful feedback and, in some cases, open issues in specific repositories according to the topic you are raising in that conversation.
Hub Description
The Alabama Water Institute (AWI) is convening a consortium of 28 university partners to improve water management for the USA. The announcement of the award to support the collaboration called CIROH is available here.
2i2c has been engaged to provide interactive computing service supporting this collaboration.
The service will initially use GitHub authentication with an allow list based on membership in the AWI GitHub organization. As the service evolves, I anticipate we may move over to CILogon.
Community Representative(s)
@jameshalgren
Important dates
Notes: dates are updated according to new information and prioritization.
Hub Authentication Type
GitHub Authentication (e.g., @mygithubhandle)
Hub logo information
URL to Hub Image:
URL for Image Link: {{ URL HERE }}
Hub user image
Extra features you'd like to enable
Other relevant information
Let's get started with a Pangeo-style Daskhub. The capacity of the team at AWI is increasing and a customized software environment will likely be ready later in the year.
I suggest this hub offer the VNC/Linux desktop feature.
This hub should be hosted on GCP in a data center that hosts the National Water Model Data.
Hub URL
ciroh.awi.2i2c.cloud
Hub Type
daskhub
Tasks to deploy the hub