Closed colliand closed 1 year ago
Let's use the jupyter/scipy-notebook image for Python, and perhaps the rocker/binder image for R. However, neither of these actually has nbgitpuller in it, which can be problematic if they are using it. I had opened https://github.com/jupyter/docker-stacks/pull/2000 to put nbgitpuller in jupyter docker-stacks.
There's no staging hub for UC Merced - I think perhaps we should change that too (#3150) so we can deploy things there for them to test. I think that's the easiest way to do this.
Here's what I think is the easiest, least stressful thing for engineers to do:
Given that the primary reason this is happening is that we're retiring the image we maintain that has both R and Python, I think this is the easiest way to do it.
I'm happy to provide review but would like someone else to be assigned to this :)
Thanks @yuvipanda! FYI @damianavila I need to reply to Sarvani with a date when the service will change to include the R + Python images drop-down menu. She asked me via text today.
@GeorgianaElena is going off the support triage cycle on Tuesday, so I assigned her this one to be worked on during the Wed-Fri period of this week. I'm also counting on @yuvipanda to provide Georgiana with any help she might need.
Thanks @damianavila. Can engineering forecast a date when the change will be deployed? Or will there be a progress milestone prior to 2i2c's capacity to forecast the deployment? Merced needs a date to plan proactive messaging for their user base.
@colliand, since I will be away for half of Thursday and Friday, I will do my best so that, by then, the following milestones, based on Yuvi's feedback, are ready for them to test:
the jupyter/scipy-notebook image for Python and the rocker/binder image for R
Does this sound ok?
Thanks @GeorgianaElena. I'm adding @schadalapaka to the thread so that she can follow along with our progress. I expect she and her colleagues will like the plan to launch a staging hub and use it for testing prior to deploying big changes to production. Sarvani, let's aim to complete the testing of the staging hub on October 9 (more ambitiously on October 6). If that all looks good, let's pencil in the plan to deploy to production on October 12. It's never a good idea to deploy big changes on Friday!
Hi All - Thank you for the notes. I have reached out to our instructors to find out their availability to test this implementation on staging hub from Oct 9-11. I will share updates as I have them.
Regards Sarvani
Hi All -
Oct 9- 11th works for us to test. However, we request you move implementation after 9:30 am PT on the 13th to avoid conflicting with a course final exam. In addition, when it is ready, please share instructions on how we can access the staging hub.
Hope this helps, please do not hesitate to let me know if you have any additional questions at this time.
Regards, Sarvani
Oct 9- 11th works for us to test. However, we request you move implementation after 9:30 am PT on the 13th to avoid conflicting with a course final exam.
Given this additional context, I would suggest extending the test period to Oct 9 - Oct 13, so the instructors have the whole week to test it, and then we deploy it to production on Oct 16th.
I will check with our instructors. Also, question: How long do you estimate the deployment takes?
@schadalapaka, there is now a staging hub running at https://staging.ucmerced.2i2c.cloud/hub/spawn, which lets users choose between a Python and an R user environment.
Choosing the Python image will launch users into JupyterLab, whereas choosing the R image will redirect them to RStudio after the server is started. Does this sound ok @schadalapaka, or would you like to use JupyterLab as the default redirect for both options, and then let the users choose to launch RStudio from there, as happens right now?
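For reference, the per-image landing page described above could be expressed roughly as follows. This is a minimal sketch, assuming a KubeSpawner-style profile list; the slugs, the untagged image names, the /rstudio path (as typically served by jupyter-rsession-proxy), and the landing_url helper are illustrative assumptions, not the actual 2i2c configuration.

```python
# Hypothetical profile list in the shape KubeSpawner's `profile_list` expects.
# Image tags are deliberately omitted here; a real deployment would pin them.
PROFILES = [
    {
        "display_name": "Python",
        "slug": "python",
        "default": True,
        "kubespawner_override": {
            "image": "jupyter/scipy-notebook",
            "default_url": "/lab",  # land in JupyterLab
        },
    },
    {
        "display_name": "R",
        "slug": "r",
        "kubespawner_override": {
            "image": "rocker/binder",
            "default_url": "/rstudio",  # assumed RStudio path; use "/lab" for JupyterLab
        },
    },
]


def landing_url(profiles, slug):
    """Return the page a user lands on after picking the profile `slug`."""
    for profile in profiles:
        if profile["slug"] == slug:
            return profile["kubespawner_override"]["default_url"]
    raise KeyError(slug)


# In a real jupyterhub_config.py this would be wired up with:
#   c.KubeSpawner.profile_list = PROFILES
```

Switching the R option to open JupyterLab instead would only mean changing its default_url to "/lab"; RStudio would still be launchable from the JupyterLab launcher.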
Hi All-
I think JupyterLab as the default redirect for both options, with users choosing between the Python and R images, would be good.
Thanks @schadalapaka. Just note that I believe redirecting users who want the R profile to JupyterLab instead of RStudio might cause some confusion. This is because from the Lab interface you can choose to start RStudio, but you can also choose to start a Python notebook. Working in a Python notebook from the R server is possible, but there will be fewer packages available for the users and they might end up confused about this.
Btw, this is what the workflow of choosing between the two profiles on the https://staging.ucmerced.2i2c.cloud hub looks like.
https://github.com/2i2c-org/infrastructure/assets/7579677/34e2e63a-6c2b-4a71-ad66-ee32fb5258dc
@schadalapaka yes the staging hub is ready for your users to start testing, using the link from https://github.com/2i2c-org/infrastructure/issues/3188#issuecomment-1747131261
Hi All - From our users: Testing out the image and one big problem: notebooks do not save as .pdf files. This is beyond the usual problem that files with documentation don't save (I think someone hasn't added the correct font), they don't save at all. I've attached the error screen.
Please let me know if this is an expected behavior.
Regards, Sarvani
@schadalapaka Can you share (privately to support@2i2c.org if necessary) the notebook that caused this? I just tried a simple notebook (just print("Hello world")) and it was able to export to PDF just fine.
@schadalapaka ah, so I can reproduce this if I'm using the R image. PDF works fine in the Python image. Maybe starting RStudio by default from the R image may reduce this confusion? Do your users use R from JupyterLab?
@schadalapaka the 13th is tomorrow and I'm not sure what to do, but I see three short-term choices available:
Hi All - Thank you for the notes. Apologies for the delayed response. I just got back from work travel.
Yes, I think starting RStudio by default would reduce the confusion.
I think the next best step is for Yuvi to make the changes in the staging hub. We can then have our users test it for a few more days before deciding on a new deployment date. I'm worried that making this change on the production server tomorrow won't give our users enough time to test it properly.
Please let me know if this works and/or if there are any concerns with this approach. Do let me know if there is any additional information I might provide at this time.
@schadalapaka, allowing the users to test this more thoroughly and adjust to the changes makes sense.
I've updated the staging hub to start in RStudio by default when choosing the R image.
Take the time to see how this works and please let us know when you know the new deployment to production date so we can plan around it.
Hi All - Our researchers have said that this Friday (10/20) after 9:30 am PT would work best for them. Please let me know if that works for you all to deploy these changes.
Another note: At this time, there doesn't seem to be an easy way for users to switch between the Python and R images. Once we choose an image, we have to break the urls to be able to stop the server and only then will we see an option to switch to another image or select between the images. Would it be possible to fix this?
Notes from our staging hub testers:
My notebooks, most likely because they're .ipynb files, appear to load as code rather than R scripts
The notebooks can load into RStudio but the original R code is buried under a mess of formatting. Is this what's going to happen to all our notebooks when we switch over?
Going back to the pilot hub, attempting to open an .ipynb notebook in RStudio gets the "File is binary rather than text so cannot be opened by the source editor" error. I am worried that if we go over to this image we will lose access to all our previous Jupyter notebooks.
@schadalapaka let me summarize the current state of issues:
1. Opening .ipynb files in RStudio doesn't really work. This is expected, as RStudio uses a different notebook format (.Rmd), not .ipynb files. Our earlier assumption was that most of the R users were using RStudio (and hence .Rmd files or .R files), not JupyterLab with .ipynb files. Is this inaccurate? Do your R users want to use both RStudio and JupyterLab for R? Or perhaps they don't want RStudio at all? So the question here is: "Do R users want JupyterLab with R as well as RStudio with R? Or only JupyterLab with R? Or only RStudio with R?"
2. "Once we choose an image, we have to break the urls to be able to stop the server and only then will we see an option to switch to another image or select between the images."
Unfortunately there really isn't. This is why we recommend separate hubs for R and Python for most cases. This is the primary confusion that users will run into. We can allow users to have multiple servers running at the same time, but in our experience, especially for educational use cases, this usually causes more confusion, not less. Multiple hubs at different URLs is often the way to go, and where I hope UC Merced eventually goes. My understanding is that this image selection is a temporary fix, as the contract probably needs to be different for multiple hubs.
3. "Going back to the pilot hub, attempting to open an .ipynb notebook in RStudio gets the "File is binary rather than text so cannot be opened by the source editor" error. I am worried that if we go over to this image we will lose access to all our previous Jupyter notebooks."
This is because the pilot hub's RStudio version is really old, and also because of (1): .ipynb files cannot be opened by RStudio. Users won't lose access to any data! We can also continue to keep the old image as an option, although I'm worried that will lead to more confusion (see (2)).
What do you think about a 30-minute call tomorrow, Oct 18, at 9 AM Pacific (or earlier, if possible - I'm currently traveling in India, and @GeorgianaElena is in the EU) to clear things up?
I can also be available for a call with @schadalapaka. This experience of R versus Python user confusion was anticipated and is a main reason why the multi-hub approach is better than image select drop-down for education scenarios. Thanks @yuvipanda for your generosity and clarity.
Just heard from @colliand on Slack that nobody at UC Merced is using RStudio! So the earlier decision to move the default to RStudio was the wrong call. Instead, we should move the default back to JupyterLab, and fix the PDF generation.
Default back to RStudio? I'm confused by what @yuvipanda wrote above.
@colliand sorry, I meant default back to JupyterLab. Edited to fix.
https://github.com/2i2c-org/infrastructure/pull/3290 moves the hub back to using JupyterLab as the default interface, and https://github.com/2i2c-org/infrastructure/issues/3289 tracks fixing PDF generation in JupyterLab in the R image. I've opened https://github.com/rocker-org/rocker-versioned2/pull/714 upstream to fix that.
So to recap:
@schadalapaka ok, so everything except PDF conversion in R is ready for testing again. That should hopefully be sorted in a day or two.
Thanks Yuvi for highlighting item 3 above: Users wanting to use RStudio specifically can still launch it explicitly if needed.
@schadalapaka @colliand we worked with upstream (https://github.com/rocker-org/rocker-versioned2/pull/714) and now PDF generation works fine from inside Jupyter as well! So from our perspective the staging hub is now good to go - please keep us posted on when we can flip the switch.
I love that feedback from Merced to 2i2c identified an upstream bug and that 2i2c has deployed upstream changes to fix the bug! Thanks Merced and 2i2c engineering for making the open source ecosystem better.
Hi Everyone -
All-Clear for 2i2c to deploy changes to production after Friday 9:30 am.
Context
UC Merced is working through a pilot with a 2i2c-operated education hub service. The pilot involves courses that use R and Python. The software image that 2i2c uses to deliver R and Python is bloated and difficult to keep up to date. The situation is reminiscent of prior experience with the University of Toronto. Merced may wish to move over to a multi-hub service after the pilot but wants to complete the current engagement using a single hub. The 2i2c team discussed some scenarios and shared recommendations with Merced.
Proposal
Change from the current single R + Python image to an image selector shown after login, with a menu that lets Merced hub users choose between two images: an up-to-date Python-focused image and an up-to-date R-focused image.
Merced personnel have been notified that the menu is likely to create some confusion among hub users. Some students intending to do Python-related work will mistakenly select the R image, etc. Merced is developing communications and local support plans to address these risks. Merced requests that 2i2c provide a precise date and time when this change will be deployed. Merced asked if this change could be done during the week of October 2 or October 9. @colliand indicated he'd respond with a precise date pending a capacity review by 2i2c's Engineering Team. FYI @damianavila.