cssh-rwth / autograde

Test jupyter notebooks in an isolated environment
MIT License
9 stars 4 forks source link

isolated views #8

Closed 0b11001111 closed 4 years ago

0b11001111 commented 4 years ago

Scenario: a student is asked to implement a function for computing a e.g. a mean/median/variance of some data points from scratch without using library code. As this can be done without any side effects, an isolated view on that function without it having access to the notebook state would be handy for testing purposes (and should be easier to implement as import restrictions, see https://github.com/cssh-rwth/autograde/issues/2).

0b11001111 commented 4 years ago

Proposal by @Feelx234:

  1. execute notebook
  2. extract target code via inspect.getsource
  3. execute infered code again in new, empty context
  4. test it
#!/usr/bin/env python3
# Standard library modules.
import gc
import sys

# Third party modules.

# Local modules

# Globals and constants variables.
import inspect

CODE = """
from pathlib import Path
def foo(x='foo'):
    print(f'>>> {x}')

def bar():
    foo('bar')
"""
from pathlib import Path
import os

get_code = """
import inspect
code = inspect.getsource({})
"""

def test(code, state, target):
    # Need a lot of boilerplate code because inspect.get_source only works on files
    filename = target+"_file"
    with open("./"+filename +".py",'w') as tmp_file:
        tmp_file.write(code)
    tmp_state={}
    exec(f"from {filename} import {target}", tmp_state)
    exec(get_code.format(target), tmp_state)
    code = tmp_state['code']

    new_state = dict()
    exec(code, new_state)
    # ensure all deletions are applied
    gc.collect()

    # call target function
    new_state[target]()

def main(argv = None):

    state = dict()

    exec(CODE, state)

    test(CODE, state, 'foo')
    test(CODE, state, 'bar')

    return 0

if __name__ == '__main__':
    sys.exit(main(sys.argv))
Feelx234 commented 4 years ago

And then to avoid using imports within that function

import unittest
from unittest import mock
from unittest import TestCase

def import_mock(name, *args):
    raise ImportError

code = """
import collections
"""
class TestImports(TestCase):
    def test_direct_call(self):
        with self.assertRaises(ImportError):
            with mock.patch('builtins.__import__', side_effect=import_mock):
                import collections

    def test_exec(self):
        with self.assertRaises(ImportError):
            with mock.patch('builtins.__import__', side_effect=import_mock):
                state = dict()
                exec(code, state)

if __name__ == '__main__':
    unittest.main()
0b11001111 commented 4 years ago

Similarly to your example, autograde uses exec to evaluate code cells which accepts a dictionary with globals for the code. Iteratively mutating a global state is straight forward to implement when the code cell is executed directly. What you do instead, is storing it in a temporary file and implicitly execute it by an import statement.

I don't see a way to expose the global state to the modules in order to make it looking like one module from the inside.

Even if that was possible, which I'm not 100% sure about, it would require a full re-design of the notebook execution part while introducing lots of IO and decreasing transparency. Eventually, that solves just a niche problem, if at all, as it's not obvious to me if the mechanism can be bypassed.

As all student submissions are hand checked anyway and by looking at a solution you usually see directly, if it was done as intended or library function is called.

Feelx234 commented 4 years ago

Maybe the point i tried to make is that autograde exports each function that should be isolated into its own python file (so hat it has no knowledge of other functions when we execude that script).

It then thereby gets decorated with a with block similar to the one I explained in the second comment. This with block now ensures that no easy import calls can be made in the function body. You can't even use exec to execute an import statement. That decorated isolated function then gets passed on to the checker function.

As all student submissions are hand checked anyway and by looking at a solution you usually see directly, if it was done as intended or library function is called.

I thought the whole idea, is that there is no need to hand check them.

Feelx234 commented 4 years ago

I implemented isolation in mode in #10

0b11001111 commented 4 years ago

As all student submissions are hand checked anyway and by looking at a solution you usually see directly, if it was done as intended or library function is called.

I thought the whole idea, is that there is no need to hand check them.

This is a fallacy I should point out more precisely in the project description. autograde checks for functional correctness (to some degree) and creates a report based on that. That's it. A hand check of each submission is still required, e.g. to fix minor errors (you will notice that a remarkable share of the students fails at filling in credentials already) or to detect plagiarism and other kind of fraud.

Probably the name is misleading. autograde is a tool that assists you with grading, not to grade stuff automatically.

I implemented isolation in mode in #10

I'll comment on that separately under respective PR https://github.com/cssh-rwth/autograde/pull/10

0b11001111 commented 4 years ago

I'll leave this issue open as we may get back on that in the future. For now, I consider it low priority.

0b11001111 commented 4 years ago

Since 68b8163fc79cf8b77582d1e9a41576cdef12e251 autograde executes code in a way that Pythons inspect tools work again. As discussed above, achieving true isolation is hard and I'd like too keep it out of autograde. A potential work around is looks as follows: