dsa-ou / allowed

Check if a program only uses a subset of the Python language.
https://dsa-ou.github.io/allowed/
BSD 3-Clause "New" or "Revised" License
10 stars 6 forks source link

M269 TMA notebooks not processed #24

Closed mwermelinger closed 1 year ago

mwermelinger commented 1 year ago

With the initial %%javascript code cell, allowed doesn't report disallowed constructs, only cells with syntax errors. Without that cell, allowed works correctly.

The issue is thus due to the cell magic or to the fact that the metadata is trusted: false.

densnow commented 1 year ago

It seems that when using the transform_cell method on the concatenated source, the whole lot is interpreted as a cell. So, when a cell magic is in first cell, the rest of the source is added as a parameter to the run_cell_magic() method.

A solution is to transform the source of each cell before adding to the source_list. This has worked for me so far, I will do further testing and submit a PR.

Edit: Doing it this way means that code following the cell magic command will not be checked for that cell, in other words: cells with "cell magics" will be skipped (although skipped is not quite the right word). If we replace the cell magic command with something like None or print(42), cells which do not have valid Python such as %%javascript or %%bash will report syntax errors.

When I ran allowed through nbqa on our sample.ipynb it did not flag any errors in the last cell meaning that cell was not checked.

mwermelinger commented 1 year ago

Indeed the problem is that we concatenate all cells into one, and so transform_cell produces different results depending on where the cell magic appears. If we start the notebook with a cell

%%javascript
var head=document.getElementsByTagName('head')[0]

then the result is

get_ipython().run_cell_magic('javascript', '', 
'var head=document.getElementsByTagName(\'head\')[0]\n...')

where ... is the content of the following code cells. The code isn't checked because it's in a string passed to a function.

If the same cell comes after a syntactically correct Python cell, then the resulting transformation is

... # previous cell's Python code
get_ipython().run_line_magic('%javascript', '')
var head=document.getElementsByTagName('head')[0]
... # next cell's Python code

Since the second line is Javascript, not Python, a syntax error ensues and the whole code isn't checked.

The current sample.ipynb works fine because %%capture output appears in the last cell, so it's transformed as get_ipython().run_line_magic('%capture', 'output') and the rest of the code can be checked as normal.

If we transform each cell before concatenating them, then a cell magic 'gobbles' the whole cell and it won't be checked.

Maybe the best solution is indeed to replace IPython commands by 'no-ops' like print(42) (although magics like %timeit in assignments are trickier to handle) and let non-Python code lead to syntax error messages.

For the moment, let's simply transform each cell. That's enough to handle the only use of cell_magics in M269: %%javascript in the assessment.

densnow commented 1 year ago

I was thinking about replacing cell magics only and let transform_cell handle the line magics.

For the cell magic to work in Jupyter notebooks, it needs to be on the first line in the cell on its own with valid arguments (as far as I can tell: the usage documentation is sparse). This would make it relatively straightforward to check and replace. It would be nice to check cells with cell magics for the cases when they have valid Python, but of course it is not urgent as you say.

There are another few small changes that could be made to the code that I might as well mention here before I forget:

in line 549 of allowed.py we have

 for cell_line_num in range(1, len(cell_source.split("\n")) + 1):

This could be changed to

 for cell_line_num in range(1, cell_source.count("\n")) + 2):

so a list does not have to be created each time.

Another thing I added that is not quite right is in sample.ipynb: the markdown above code cell 3 (the cell magic) says that the rest of the code will be added to the second parameter, It seems to be the third parameter, so that needs amending.

Finally , i did not run tests.sh create so the current files in the tests folder will be out of sync with any new outputs

mwermelinger commented 1 year ago

Fixed by #25