Open jnywong opened 5 days ago
A correction:
6 - 12 US/Pacific, but individual use will likely continue after hours
@jnywong I've sent two replies to your Nov. 11 email from support@2i2c.freshdesk.com but I haven't heard back. I still don't know if email is the right avenue, or if we should be having those exchanges on a ticket system or here. So, I'll paste the exchanges here, with followups.
First I'm pasting what I submitted in my original ticket:
The second OceanHackWeek (OHW)-in-Spanish event is coming up, Nov 25-29. It follows the previous event from October (see https://github.com/2i2c-org/infrastructure/issues/4883). We have some issues that have come up now or in 2023 that I'd like to address:
Regarding the embargo and localization:
With the embargo, I do not believe there has been any progress since the previous event.
Too bad. I don't expect that we'll find a solution in the next two weeks, but it'd be really helpful if we could make some progress. Please see the brief exchange I pointed to in my request, from Feb. 2023 (starts here), especially the last comment from yuvipanda:
@emiliom indeed, we can't tell either about what falls under and what does not. Clearly GitHub is accessible, and they don't make any mention of needing OFAC permission to allow it to be accessible in Cuba. I also can't seem to find any mention in Google Cloud of network restrictions being in place. colliand has graciously offered to chat with CS&S (our fiscal sponsors) to try to figure out how to get clarity around this.
Can you comment on at least avenues for exploration of the issue with blocked access in Cuba? Has this issue not come up in 2i2c's collaboration with Metadocencia (the Catalyst project)?
For localisation, I can recommend a CI/CD workflow using GitHub and Crowdin.
Can you point me to an example of a 2i2c hub that's using this workflow, where I can see the nuts and bolts of what's being localized and how?
We can let the questions about improvements to localization sit. What we've already done covers most of what we'd want, so it's a low priority. But given 2i2c's collaboration with Metadocencia on Catalyst hubs in Latin America, I hoped that there'd be best practices already in place that we could adopt or at least examine. BTW, I was pleased to discover the materials in Spanish already created by that project for "Hub Champion training", https://catalystproject.cloud/hub-champion-training/es/. That's already helpful. Through that site I also found Catalyst hubs in Latin America that had a lot of boilerplate text on the login page already translated to Spanish.
Environment management for R image
Regarding your questions about environment management, are you using repo2docker? If so, then let me know the GitHub repo and I can take a quick look at your configuration.
We're not using repo2docker. But we do have GH actions that create an R Docker image (and a separate Python image) when we update the environment: https://github.com/oceanhackweek/jupyter-image/
The configurations for the R environment are in the "r" directory, specifically in r/environment.yml and r/Dockerfile; r/conda-linux-64.lock is generated from r/environment.yml.
Would you still be able to help or point us in the right direction? For example, can you point us to the repo of another 2i2c hub that has a full-fledged R environment, including RStudio?
I looked around 2i2c documentation vis a vis repo2docker, and landed here: https://docs.2i2c.org/admin/howto/environment/ I'm struck that much of what's in the 2i2c base-image repo (https://github.com/2i2c-org/2i2c-hubs-image) is quite old. For example, in the Python requirements file, pandas is pinned to 1.3.5 -- ancient. The repo itself hasn't been updated in 18 months.
Our image building workflow was first created in 2021, when we first adopted a 2i2c hub. It's evolved over time, sometimes incrementally and sometimes in bigger steps. We use the GitHub Container Registry (ghcr.io) rather than quay.io, which is what the 2i2c docs recommend. We don't use repo2docker, due to problems encountered early on.
The OceanHackWeek technical team is discussing some of the challenges with the R environment / image, including the background that led to it, at https://github.com/oceanhackweek/jupyter-image/issues/90. Feel free to chime in! In PR https://github.com/oceanhackweek/jupyter-image/pull/97 we've updated Python, pangeo-notebook, the miniconda3 image and the rstudio-server package. We ran into a bunch of issues with conda and libmamba (discussed there), but I think we've resolved them. The image builds w/o errors. However, RStudio is not launching when clicking on the RStudio launcher in Jupyter Lab, in a local test. After a wait of a few seconds, we get this screen:
I'll be happy if we can just get this updated image to work. I've already found a way to make it easier to manage R package dependencies.
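For anyone reproducing the local test, a first check is whether the pieces the RStudio launcher depends on are visible inside the container. The sketch below assumes the launcher comes from the jupyter-rsession-proxy package (Python module `jupyter_rsession_proxy`) and that the server binary is named `rserver`. Neither is confirmed in this thread, and the helper name `check_rstudio_prereqs` is hypothetical.

```python
import importlib.util
import shutil


def check_rstudio_prereqs(binary="rserver", proxy_module="jupyter_rsession_proxy"):
    """Report whether the RStudio Server binary and the proxy package are visible.

    Both default names are assumptions about the image, not facts from this thread.
    """
    return {
        # Full path of the binary if it is on PATH, otherwise None.
        "binary_path": shutil.which(binary),
        # True if the module can be imported by the current Python environment.
        "proxy_installed": importlib.util.find_spec(proxy_module) is not None,
    }


report = check_rstudio_prereqs()
for key, value in report.items():
    print(f"{key}: {value}")
```

Run it with the same Python that runs the Jupyter server in the image. A `None` for `binary_path` only means the binary is not on `PATH`; it may still be installed elsewhere.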
Hi @emiliom ! I have been on annual leave and on a training course for the last 2 days – continuing on the support desk is fine, but I can continue the exchange here.
Can you comment on at least avenues for exploration of the issue with blocked access in Cuba? Has this issue not come up in 2i2c's collaboration with Metadocencia (the Catalyst project)?
I'm afraid this is beyond our control, but I can ask my colleagues about other avenues. Is a VPN out of the question?
Can you point me to an example of a 2i2c hub that's using this workflow, where I can see the nuts and bolts of what's being localized and how?
Yes! You are indeed right about our collaboration with MD on The Catalyst Project, and a CI/CD workflow is set up in the following repository: https://github.com/czi-catalystproject/hub-champion-training. You can follow the documentation for the Crowdin GH action at https://github.com/crowdin/github-action.
Environment management for R image
I am going to take a 30-min timebox to investigate here and will follow up in another comment. Note that 2i2c is limited in providing bespoke support for image customisation, but we are working behind the scenes to try and improve this experience.
I was able to pull your most recent published R image into a 2i2c hub and did not come across a 500 error.
You can see from the screenshot that the RStudio launcher is available and opens RStudio fine, and that $JUPYTER_IMAGE=ghcr.io/oceanhackweek/r:41445c1.
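To confirm which image a running session was started from, the same variable can be read from a notebook or terminal on the hub. This is a minimal sketch: `split_image_ref` is a hypothetical helper, and it only handles `repository:tag` references like the one above, not digest (`@sha256:...`) references.

```python
import os


def split_image_ref(ref):
    """Split 'registry/repo:tag' into (repository, tag); tag is None if absent."""
    name, sep, tag = ref.rpartition(":")
    # A colon that belongs to a registry port (e.g. localhost:5000/img)
    # is followed by a path, so the "tag" part would contain a slash.
    if not sep or "/" in tag:
        return ref, None
    return name, tag


image = os.environ.get("JUPYTER_IMAGE", "")
if image:
    repository, tag = split_image_ref(image)
    print(f"repository: {repository}")
    print(f"tag: {tag}")
else:
    print("JUPYTER_IMAGE is not set in this session")
```

Comparing the printed tag with the tags published by the image-building workflow shows whether the hub is serving the older published image or a newer build.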
I believe this version is missing a few upgrades that you are trying to add in https://github.com/oceanhackweek/jupyter-image/pull/97 but I am unfortunately unable to follow your custom image-building setup.
Would you still be able to help or point us in the right direction? For example, can you point us to the repo of another 2i2c hub that has a full-fledged R environment, including RStudio?
I can point you in the direction of our community CryoCloud's RStudio configuration that follows our recommended repo2docker action: https://github.com/CryoInTheCloud/hub-Rstudio-image.
You can also try asking others on the Jupyter discourse and you are welcome to ask our other community members on our 2i2c Slack workspace.
Thanks @jnywong ! The Catalyst hub-training repo and the CryoCloud RStudio config repo look like great resources.
Regarding the embargo issue, VPN is an option for some, but I don't think it's available to everyone. Last year I spoke with someone in Brazil who used to work with Google Cloud (I think) and also with MetaDocencia. She wondered if hosting the hub in a region outside the US (e.g., Brazil) would make a difference. This year we'll potentially have one participant from Cuba. Anyways, definitely not something anyone can "resolve" in the next couple of weeks, but it'll still be helpful to start scoping out options and getting more clarity.
About the R images: sorry, the published R image does work, as you saw. That's what we used in the October event. Our attempts to upgrade it are the ones that are still not working out. Thanks for the pointer to a 2i2c Slack workspace! That could be really helpful. I didn't know about it, and I can't find a link to it at either https://2i2c.org or https://docs.2i2c.org; could you send me an invitation or point me to where I can join, offline?
The link to the Freshdesk ticket in which this event was reported
https://2i2c.freshdesk.com/a/tickets/2426
The GitHub handle or name of the community representative
@emiliom
The date when the event will start
Monday, Nov 25
The date when the event will end
Friday, Nov 29
What hours of the day will participants be active? (e.g., 5am - 5pm US/Pacific)
6 - 12 US/Pacific, but individual use will likely continue after hours
Are we three weeks before the start date of the event?
Number of attendees
50
Make sure to add the event into the calendar
Does the hub already exist?
The URL of the hub that will be used for the event
https://oceanhackweek.2i2c.cloud/
Will this hub be decommissioned after the event is over?
Task list
Definition of Done