cmbz closed this issue 9 months ago
2024/01/26
I'll be using this issue to track the effort of porting the GEOS-Chem supplied notebook into something that can be deployed on MERC and used in the context of the MOC-PoC presentation.
Thanks, @landreev, but please give me a heads-up before you close the issue.
Wasn't planning to close it, no. @pdurbin and I briefly discussed opening a separate local issue to track the dev. effort needed to port the notebook into the framework of the demo. We decided to use this one instead, since it was already there. But I can still open a new one if you prefer.
(I just wanted to have something "in progress" to reflect this task, since this is the focus of the MOC-POC effort at this point).
@landreev totally fine to keep on with this issue!
If it helps, I now have OpenShift running locally on my laptop because I was looking at this PR:
I just started a thread in Slack asking whether it would be helpful or interesting for me to try to run the notebook on my local version of OpenShift, but I'm not sure where to begin.
I'm not sure if this helps or not, but @r1beguin recently demoed launching JupyterLab from Dataverse. Here are some screenshots from his July 2023 community call presentation:
The code is here: https://forgemia.inra.fr/dipso/eosc-pillar/dataverse-jupyterhub-connector
I just merged a PR where there's a nice writeup of the tool on our "integrations" page:
Also, from their README, here's a diagram of how it works:
JupyterHUB, not "Lab", right?
@landreev whoops, yes, hub not lab.
Can this setup be used in our case, for the purposes of the demo? - I don't fully understand this part.
Yesterday I gave Bob Yantosca, the author of the notebook, the SSH key to a NERC VM and asked him to install the notebook there, replicating the environment under which he developed it. The data files are already saved locally on the instance. I'm waiting to hear from him. Once it's running like that, we'll at least be able to see what it looks like, and then we can add extras to it: the storage calls, the passing of parameters, and figuring out how it can be deployed in a container. So this is the extent of my current plan.
I have a very crude/fake/hard-coded/everything glued together with dog drool kind of a demo that nevertheless ties the pieces together - the dataset with the GEOS-Chem datafiles in it and the "external tool" that sends the user to the statistics notebook, which in turn generates pretty graph images. I will post links/images in the Slack channel as a quick status update, and will continue working on making the whole thing less fake/hard-coded.
I marked the remaining demo-related items on the checklist as completed and I'm removing my name from the issue (@cmbz you asked me not to close it - so, leaving it as is). This is under the assumption that this is completed "for the purposes of the demo presentation", as a quick proof of concept only. I will open a new issue in the main repo for working out a real infrastructure setup that will allow users to run arbitrary, non-hard-coded computation code on a cluster. That is the next logical step, and it makes sense to work on this while we have access to the NERC cluster facilities.
I'm not actively working on this so I removed my name as well.
Closing issue as complete. Follow-up work to create Harvard Dataverse Repository GEOS-Chem collections will continue here: https://github.com/IQSS/dataverse-pm/issues/178
Overview
Two-phase project to investigate and pilot large data and computation support for GEOS-Chem datasets using a containerized Dataverse installation running on Mass Open Cloud resources.
The proof-of-concept will be demoed at the Mass Open Cloud Alliance Conference (2024/02/28)
Participants
Timeline
Tasks
January, 2024 and February, 2024
March, 2024
Related
Resources