Should this track use test generators?

cmccandless commented 5 years ago

Pros:

Consistent test format
(Generally) Much easier to add new tests from the canonical data

Cons:

Some exercises don't have canonical data (although this number is shrinking rapidly)
Creation of test generators for new exercises/exercises without generators can be more complicated than even implementing the exercise itself.

Discussion commence... now.

TL;DR:

Test generators are (probably) a good idea
Use Jinja2 templates for code generation
Support exercise-specific templates.

yawpitch commented 5 years ago

Is there a working example of this for another language track?

Generally I'd say yes it would seem sensible to generate tests from the canonical data ... that said, with Python's rather rich and diverse -- and at times infuriatingly half-sensical -- set of assert statement variants, I foresee a great deal of difficulty reliably mapping canonical data into a generated test suite that wouldn't need sometimes substantial manual intervention after the fact.

cmccandless commented 5 years ago

Is there a working example of this for another language track?

Absolutely. The two I am more aware of are the C# and Ruby tracks.

I foresee a great deal of difficulty reliably mapping canonical data into a generated test suite that wouldn't need sometimes substantial manual intervention after the fact.

You're not wrong. The solution to this is to have a single base generator that might work for \~half of the exercises, then to extend/override the base generator for the exercises that require a more custom solution (example in my fork), or to just create individual generators from the start (probably overkill). This requires a heavier initial workload, but makes adding/modifying test cases trivial because one only needs to run the appropriate generator and verify that the generated code behaves correctly.

yawpitch commented 5 years ago

Right, well looking at your example, an initial thought is that all the code generation is a good case for Jinja2. Once you've got a template for each of the various forms of assert, you can use the tools in Jinja2 to combine them and render them. Much simpler than directly writing all those newlines.

cmccandless commented 5 years ago

I like it! I don't have the time to dive into that myself right now, but here's my thoughts:

Use Jinja2
Have a central generate.py <exercise> script that uses either:
- An exercise-specific template, if available (hello_world.tmpl)
- A general-use template (base.tmpl)

It would be really nice to be able to have that generate.py and base.tmpl implemented in the next week or so. That way, there would be a whole chunk of issues for valid base.tmpl for <exercise>, or implement <exercise>.tmpl ready for Hacktoberfest contributors!

yawpitch commented 5 years ago

I'm not certain I'll have enough time free to dedicate to this in the next two weeks, but maybe.

What are the constraints, though? Specifically what version of Python would this need to be targeted to? And how freely can dependencies be added? I see a requirements-travis.yaml, but nothing else. The README indicates that the tests themselves need to run in 2.7 onwards, but does the test generator?

I'd consider a simple, leverage-what-we-can workflow to be generate the file from a general template that imports another template with Jinja2 macros defining the various blocks for specific kinds of tests, rendering that to disk but don't worry about formatting. Then kick off an autoformatter like Black to handle line length, etc. But Black, for instance, requires Python >= 3.6.

yawpitch commented 5 years ago

If we don't care about the autoformat idea, then Jinja2 itself is backwards-compatible, but we could quite easily get into some overlength lines and just uggo tests.

cmccandless commented 5 years ago

I would be open to autoformatting, and I don't think requiring Python >= 3.6 is unreasonable. I am not familiar with Black specifically, but it looks solid enough to count on; can I assume it conforms to PEP8?

yawpitch commented 5 years ago

In their own words Black uses "a strict subset of pep-8"; it's a bit more opinionated than pep-8 (for instance it prefers double quotes to single quotes unless there are double quotes in the string itself), and it does a few things that flake8 for instance disagrees with.

Though maybe we should start with the Jinja2 part of things, and look at autoformatting later. When it comes down to it and glancing through a number of canonical data files we're going to have our hands full just generating in any format.

cmccandless commented 5 years ago

Though maybe we should start with the Jinja2 part of things, and look at autoformatting later.

Fair enough. Having looked at test generation in the past, I would propose that there are 3 categories of test suites:

Stateless: no state information is used, test format is as simple as module.property(**input) (Ex: hello-world, bob, two-bucket)
Stateful: requires instantiation of some user object; slightly more complicated format (Ex: simple-cipher, kindergarten-garden)
Other: involved some sort of list of operations; very difficult to generate test code (Ex: react, zipper)

yawpitch commented 5 years ago

Yeah, there's just no obvious indication in the canonical data which is which.

For instance for hello-world the "exercise" would map to the module name and the "property" to a function to be imported from that module.

But for allergies the "exercise" could still map to the module name, but also a class within that module, and the "property" in this case would (via string transformation to conform to pep8) to a method on that class.

There's no way to know, programatically, which type of generator would be required. Which sort of implies that the default template should be for the stateless variety and you'd need to special case all the stateful. And for "other", I just dunno.

cmccandless commented 5 years ago

Which sort of implies that the default template should be for the stateless variety and you'd need to special case all the stateful.

That's the conclusion I reached as well. Stateful could also have a generic template, of the form:

def test_stateful_format(self):
    cls = getattr(module, module.title().replace('-', ''))
    obj = cls(**input)
    self.assertEqual(obj.property(), expected)

cmccandless commented 5 years ago

I started on the generic template and generator script here.

yawpitch commented 5 years ago

Hah! Too fast man! I'm working on one too!

cmccandless commented 5 years ago

I wasn't sure if you were going to start working on it immediately or not; I don't mind taking a back seat in this one! Feel free to hijack anything from my work in progress that you think would be useful for yours.

cmccandless commented 5 years ago

@yawpitch Perhaps it would be wise to create a WIP PR for this?

yawpitch commented 5 years ago

Will do; but I won't be able to work on it till later tomorrow. M On Sep 28, 2018, 13:42 +0100, Corey McCandless notifications@github.com, wrote:

@yawpitch Perhaps it would be wise to create a WIP PR for this? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

yawpitch commented 5 years ago

Got a bit of time to take a pass at this today; still getting up to speed with what all is in the canonical data format, and not sure how to handle conversion of named inputs to positional / keyword arguments on any sort of a reliable bases. But it's early days.

cmccandless commented 5 years ago

not sure how to handle conversion of named inputs to positional / keyword arguments

A simple solution would be to treat all inputs as keyword arguments.

yawpitch commented 5 years ago

Not ideal. Forces the students to use the same signature and parameter names in their code as the canonical data author, which are sometimes less than sensible (Sum of Multiples comes to mind)... but may be the best choice. Also eliminates the problem of the input being defined in JSON in a type that has no inherent order in Python.

Doing some more on this tonight, after having a longer look at the canonical data schema.

M On Oct 1, 2018, 14:33 +0100, Corey McCandless notifications@github.com, wrote:

not sure how to handle conversion of named inputs to positional / keyword arguments A simple solution would be to treat all inputs as keyword arguments. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

cmccandless commented 5 years ago

Also eliminates the problem of the input being defined in JSON in a type that has no inherent order in Python.

:+1:

yawpitch commented 5 years ago

Opened a new PR (#1547); this one is actually functional and will create tests, plus will auto format them to pep-8 and will verify that the code is viable ... this revealed some additional issues with using the dict in the canonical data to present keyword arguments (POV for instance uses a Python reserved keyword as a name in the input, and ETL uses numerical keys, which are illegal names to pass into a function call).

Basically I stopped doing so much work to update the incoming dict and instead made the text processing filters available to the templates ... this should make the work of building out templates that need to approach the data differently (ie teasing out a class name and instance methods) easier.

github-actions[bot] commented 3 years ago

This issue has been automatically marked as abandoned because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] commented 3 years ago

This issue has been automatically marked as abandoned because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

cmccandless commented 3 years ago

The test generator has been in place for quite a while now; this can be closed.

exercism / python

Should this track use test generators? #1513

TL;DR: