openedx / edx-platform

The Open edX LMS & Studio, powering education sites around the world!
https://openedx.org
GNU Affero General Public License v3.0
7.36k stars 3.85k forks source link

Move edx-sandbox requirements to codejail-includes #34375

Open kdmccormick opened 6 months ago

kdmccormick commented 6 months ago

Context

"edx-sandbox" is the environment in which user-authored edx-platform code is safely executed using the codejail library. Today, the edx-sandbox environment's requirements are installed from edx-platform.

One of its requirements is codejail-includes, which contains code that was formerly stored in edx-platform. We extracted it out so that the edx-platform and edx-sandbox Python versions could be managed independently. Please note: the standard edx-platform environment also installs codejail-includes so that user-authored code can be unsafely executed in tests and development.

We find ourselves wishing that we could offer multiple versions of the edx-sandbox environment. This would allow site operators test out, jump ahead, or lag on edx-sandbox updates (probably just Python and library upgrades, but maybe one day API changes) separate from their regular release upgrades. It would allow us to give site operators advance notice on edx-sandbox environment changes and ask that some operators try out new edx-sandbox environments ahead of named releases.

We see two ways to do this:

  1. Stick more .in files into the edx-platform edx-sandbox requirements folder. Operators would choose different edx-sandbox versions by pip-r-installing different .txt files in that directory.
  2. Move everything from edx-platform edx-sandbox requirements to be requirements of codejail-includes, allowing us to get those edx-sandbox requirements lists out of edx-platform. Ideally, the list in codejail-includes should be a new .in file, so that it is clear that these packages are not required by the actual code within codejail-includes. The generated .txt file would be included into edx-sandbox's base.in. That way, operators would choose different edx-sandbox versions by pointing at different versions of codejail-includes.

I think 2 is a more elegant solution.

Tasks

TBD

kdmccormick commented 6 months ago

FYI @feanil

timmc-edx commented 6 months ago

A couple questions:

kdmccormick commented 6 months ago

Oof, yeah, that readme needs some help 😂

Background: Instructor-authored code executed in edx-platform's codejail environment (aka edx-sandbox) can import from the standard library as well as a set of third-party packages, listed in the edx-sandbox folder. Some of that is third party stuff like scipy and numpy; some of it is custom Open edX utility functions. The custom stuff used to be distributed throughout edx-platform in a concerning way, so a couple years ago, we exracted all the custom code into codejail-includes, which is now installed as an edx-sandbox requirement.

The proposal here is to go a step further and move the edx-sandbox requirements list itself out of edx-platform and into codejail-includes' install_requires list. edx-sandbox would still install those requirements, but it would do so transitively through codejail-includes. This would be great for external codejail services, since they could just install codejail-includes rather than duplicating the whole edx-sandbox requirements list from edx-platform.

timmc-edx commented 6 months ago

Got it, thanks. And yeah, sounds like option 2 would be best.

kdmccormick commented 6 months ago

@feanil decided to go with (1) for Redwood just because it's a previously-trodden path and we need to hustle, but I'll keep this issue open as a good future improvement.

feanil commented 6 months ago

That makes sense to me @kdmccormick

kdmccormick commented 5 months ago

solution (1) is here: https://github.com/openedx/edx-platform/pull/34509

I'll leave this ticket open in case someone has time to work on solution (2). I have no immediate plans to, but I could see it being valuable in the future next time we invest in codejail improvements.