berkeley-dsep-infra / datahub

JupyterHubs for use by Berkeley enrolled students
https://docs.datahub.berkeley.edu
BSD 3-Clause "New" or "Revised" License

[Inquiry] Can Datahub Be Used For Large Class-Size Examinations? #3048

Closed cdbeon closed 2 years ago

cdbeon commented 2 years ago

PH 142 is currently looking into how feasible it is to hold our final exam online on Datahub. Our current plan is the following:

  1. Send out a compressed file containing the exam Rmd as well as a dataset in csv format
  2. Students will complete the exam simultaneously using the data provided. This will be done on Datahub (or in a local RStudio installation, if they choose to do so)
  3. Students will knit their file and submit their exam to Gradescope

A few additional details/logistics about the course and exam:

Given this information, would it be possible to hold said exam with the current resources available? Would be happy to provide more information if needed.

cdbeon commented 2 years ago

@corinne-riddell @kmaccuish

balajialg commented 2 years ago

Thanks, @cdbeon, for sharing the detailed requirements! Posting my Slack response below for the sake of continuity.

I guess provisioning more nodes during the 3 hours of the exam is something we can definitely do at our end. We did try the synchronous use case with the Political Science 3 course, where 300+ students worked on R notebooks during class hours. We had some hiccups initially but were able to make it work through a few workarounds.

Let's discuss the way forward in this thread!

felder commented 2 years ago

@cdbeon This sounds like something we should totally be able to accommodate by using the scheduling calendar to increase node provisioning during this period, or perhaps from 6-11 just to give a bit more wiggle room.

It may also be helpful if students are encouraged to sign in a little ahead of time (if they can) so we don't have 340 folks all hop on at once at 7pm. However, I think if we go for the scheduled increased node provisioning route it'll probably still be ok.
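For a rough sense of the sizing question, here is a back-of-the-envelope sketch in Python. Only the 340-student figure comes from this thread; the per-student memory guarantee, node memory, and reserved overhead are hypothetical placeholders, not the hub's actual configuration, and the function name `nodes_needed` is made up for illustration.

```python
import math


def nodes_needed(students: int, mem_per_student_gb: float,
                 node_mem_gb: float, reserved_gb: float = 0.0) -> int:
    """Estimate how many nodes are needed so that every student's
    memory guarantee fits, packing students by memory only."""
    usable_gb = node_mem_gb - reserved_gb
    if mem_per_student_gb <= 0 or usable_gb <= 0:
        raise ValueError("memory figures must be positive")
    # Whole students that fit on a single node.
    students_per_node = math.floor(usable_gb / mem_per_student_gb)
    if students_per_node == 0:
        raise ValueError("a single student does not fit on one node")
    return math.ceil(students / students_per_node)


if __name__ == "__main__":
    # 340 students is from the thread; 2 GB per student, 52 GB nodes and
    # 4 GB reserved for system pods are ASSUMED values for illustration.
    print(nodes_needed(340, mem_per_student_gb=2.0,
                       node_mem_gb=52.0, reserved_gb=4.0))
```

Plugging in the real guarantee and node size would give the actual number; the sketch ignores CPU limits and any users already on the hub.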

Also, you mention that the students currently use the Public Health hub, but that the exam would take place on Datahub. Is that right?

@yuvipanda do you think maybe we should also provision more CPU for each student during the exam period just to make sure there are no permutation hiccups?

cdbeon commented 2 years ago

Forgive me, I thought Datahub was just the catch-all term for all the hubs.

This exam would take place on the Public Health Hub.

felder commented 2 years ago

@cdbeon just wait until the "Datahub" building is constructed!

https://statistics.berkeley.edu/about/news/largest-gift-berkeleys-history-data-hub-building

ryanlovett commented 2 years ago

@felder Fortunately it is now called the Gateway Building. I guess that creates an issue for Network Operations however. :)

yuvipanda commented 2 years ago

@felder Someone should run through the expected notebooks on datahub and see CPU / memory usage. We need to increase CPU limits only if they are planning on using a lot of CPU
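One way to get those numbers without watching a dashboard is to time the knit from a terminal on the hub. The sketch below is a minimal example, assuming a Unix environment (Python's `resource` module is not available on Windows); the helper name `measure` and the file name `exam.Rmd` are hypothetical.

```python
import resource
import subprocess
import sys


def measure(cmd):
    """Run cmd to completion and return (cpu_seconds, peak_rss).

    cpu_seconds is the user + system CPU time consumed by the command.
    peak_rss is the largest resident set size among all child processes
    waited for so far by this Python process (kilobytes on Linux,
    bytes on macOS), so run one command per Python process for a
    clean reading.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    subprocess.run(cmd, check=True)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    cpu_seconds = ((after.ru_utime - before.ru_utime)
                   + (after.ru_stime - before.ru_stime))
    return cpu_seconds, after.ru_maxrss


if __name__ == "__main__":
    # Stand-in workload so the sketch runs anywhere. For the exam,
    # something like the following (hypothetical file name) would knit
    # the notebook instead:
    #   ["Rscript", "-e", "rmarkdown::render('exam.Rmd')"]
    cpu, rss = measure([sys.executable, "-c", "sum(range(10**6))"])
    print(f"CPU seconds: {cpu:.2f}, peak RSS: {rss}")
```

If the peak memory and CPU time stay well under the per-user limits, there is no need to raise them for the exam.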

balajialg commented 2 years ago

@cdbeon Once the teaching team has finalized its decision about Datahub and is ready with the notebooks, can you share the notebooks with us so we can evaluate the CPU/RAM requirements? A few days' heads-up prior to 12/13 would be appreciated.

felder commented 2 years ago

@balajialg @yuvipanda I've added an entry into the scaling calendar for 12/13 from 6-11 and specified 6 extra nodes in the beta pool for this.

I also moved the evening cooldown entry to after 11.

balajialg commented 2 years ago

@felder Thanks for looking into it! Did you get any confirmation from the teaching team or @cdbeon that they will use the hub for their exams?

felder commented 2 years ago

@balajialg no

balajialg commented 2 years ago

@cdbeon Can you confirm ASAP whether you still plan to use Datahub for this examination? If not, we can revert to the original settings.

cdbeon commented 2 years ago

Hello! Sorry about the delay. Yes, we will be using Datahub as part of the exam. However, the amount of computation that the students are required to do will be much lower than we originally expected.

balajialg commented 2 years ago

@cdbeon That's good to know! Let's keep @felder's allocated compute as is during the exam time to ensure that the odds of the hub struggling to scale up are pretty low. Thanks!

felder commented 2 years ago

@cdbeon how did the exam go last night?

cdbeon commented 2 years ago

It went smoothly! No complaints about the hub crashing or lagging. Thank you all so much!

felder commented 2 years ago

Great! With that, I'm going to close this issue.