Representer Updates and Golden Tests

BethanyG commented 11 months ago

This appears at first glance to be a horrifying amount of files changed. Don't be alarmed. 😉
The bulk of the "changes" are for the 167 new Golden tests that required "recording" :

mapping.json,
representation.json,
representation.out, and
representation.txt

for each test, as well as the creation of a code file and a config.json for each test. which adds up to ~1, 002 files that only need "spot checking", if any checking at all.

There are 10 example tests, 22 concept exercise tests, and 135 practice exercise tests. The example tests try to get directly at normalization "issues", whereas the rest of the tests are using exemplar/example code copied from the content repo.

The remaining files are the other non-test case changes listed below. I am least confident in the CI/workflow changes, and could use a going-over on the shell scripts as well. 😄

Due to the changes in normalization and representation.txt, all representations for Python will need to be re-run.

[x] Added Golden Tests for example, concept and practice exercise groups.
- [x] Added config.json files for each test.
- [x] Added Golden Files for mapping.json, representatin.json, representation.out, and representation.txt
[x] Added Test script for Golden Tests (test/test_representer.py)
[x] Added new run-tests-in-docker.sh script
[x] Added CI workflow to run Golden Tests on PR Push
[x] Added representation.json metadata output file (edited representer/__init__.py)
[x] Removed astor library dependency and converted to using Python AST module (edited test/utils.py)
[x] Added test directory and files to .dockerignore
[x] Changed docker image to use python:3.11.5-alpine3.18
[x] Edited run.sh and run-in-docker.sh scripts
[x] Changed representation.txt to be an AST string
[x] Changed normalizer.py to remove __main__ blocks and print() statements.
[x] Changed normalizer.py to handle generators.
[x] Changed __init__.py to create AST without typehints and without indentation.

There are probably more small changes I've missed due to the fact that this took a really long time to complete. Please let me know if you have any questions or issues.

I will follow up with a small PR to change the README after merge, as well as to add representations.txt to the Python content repo.

BethanyG commented 11 months ago

This looks great! One small question: does the test runner handle multiple submitted files? As in: if a student submits both leap.py and leap_helper.py, will both files be used in the representation?

Good Question. Short answer: no on the representation, ~~ambiguous~~ no on the test runner. Does the toolchain now need to account for multiple files?

Except for 'Cater Waiter', where we provide a helper file that is also present in the editor, ~~I would have to test to make sure of the test runner behavior.~~ We don't do anything explicit to process multiple files, so I think that means we don't (fully) account for multi-file solutions.

I think that as long as the test runner/Python can import the data in question, then it can run the solution just fine -- but none of the config files for exercises list anything but the <exercise-slug>.py (not a pattern) file as part of the solution, so I don't know what actually gets copied into the container.

So I thought in the past that the whole solution folder was copied, but no longer. So the import scenario from above no longer works -- the runner does not process multiple files, and neither does the rest of the toolchain. The one exception is the (exercisim-provided) 'Caiter Waiter' data file. But its not considered "part" of the student solution - only required for it (so already in the Docker container).

So even if a multi-file solution can be tested (and only in the context of importing into the main file, not the file itself), it can't be fully represented right now. Nor would the Analyzer "see" the additional file as part of it's workflow.

~~But let me do some testing today to be really clear.~~ I do think formally handling multiple files might take some work, but maybe not a whole lot. Should I put that on my to-do list? 😄

ErikSchierboom commented 11 months ago

I do think formally handling multiple files might take some work, but maybe not a whole lot. Should I put that on my to-do list? 😄

That might be a good idea :) It's not high priority or anything, as the vast majority of solutions will not add new files, but in theory it is possible. I expect it to be more likely for more complex exercises, where extracting code to different files can help structure things a bit nicer.

Except for 'Cater Waiter', where we provide a helper file that is also present in the editor, I would have to test to make sure of the test runner behavior. We don't do anything explicit to process multiple files, so I think that means we don't (fully) account for multi-file solutions.

Yeah, so the logic to use probably looks something like this:

Find all .py files
Exclude the following files (which the student should not modify):
- Test files
- Editor files
- Invalidator (I don't expect there to be .py files in here, but just in case)

Do you want to have a go at adding this before we merge this?

BethanyG commented 11 months ago

Sure - I can give it a go! 😄 I'll need to do the runner and analyzer as well. But I think I will save the Analyzer for later (I have a bunch of stuff to do for it -including using Alpine and making golden tests- so its going to be a while).

BethanyG commented 8 months ago

Alrighty! I think were good to go...but maybe I will have you merge this, so that I don't automatically trigger something catastrophic?

I renamed the job to test, and judging from the CI the world didn't end. 😉 But let me know if there are additional edits needed. I will be around. 😄

ErikSchierboom commented 8 months ago

Just to be clear: this will change the representation output and thus require the existing representations to be re-run?

BethanyG commented 8 months ago

Just to be clear: this will change the representation output and thus require the existing representations to be re-run?

Yup. It will. 😢 Specifically:

Changed representation.txt to be an AST string --> changes the representation from normalized code to AST
Changed normalizer.py to remove __main__ blocks and print() statements. --> changes how solutions are compared, since more things are removed during normalization.
Changed normalizer.py to handle generators. --> solutions with generators now can be represented (they weren't being fully processed)
Changed __init__.py to create AST without typehints and without indentation. --> _changes the way the AST is produced__
Added representation.json metadata output file (edited representer/__init__.py) --> _adds representer.json for all solutions.
And we re-ordered things in representer.out --> normalized code is now at the top of the file.

Edited to add: Since we've replaced astor with ast.unparse (and thereby fixed the pattern matching issue), that will also change all solutions that use pattern matching (or any other feature that wasn't supported by astor).

ErikSchierboom commented 8 months ago

We'll re-run the representations shortly (it'll take a while).

exercism / python-representer

Representer Updates and Golden Tests #45