exercism / python

Exercism exercises in Python.
https://exercism.org/tracks/python
MIT License

Lack of clarity in Python testing #1827

Closed. kenlyle2 closed this issue 5 years ago.

kenlyle2 commented 5 years ago

At https://exercism.io/tracks/python/tests, it's not clear to me what the point of the test files is... Yeah, you can run pytest with the test file provided with the exercise, but there seems to be no naming convention or mechanism for relating the tests to the implementation under test.

For pangram, for example, I could cut and paste the sentences under test into my program, but I am sure that's not the idea.

Is it possible there is a sentence or two missing? It seems like the tests need the implementation filename as input, or that the test file needs to be imported into the implementation, or similar. Clearly, I am missing something, and would appreciate any guidance.

jonmcalder commented 5 years ago

I'm not sure if I'm misunderstanding your question, but if you look at line 3 of the test for pangram, you should see that the test is specifically importing the implementation of is_pangram() from the pangram.py file.

The same pattern exists in all the other exercises and that is the "mechanism" for relating each test to the implementation for each exercise.
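[For readers following along: a minimal sketch of what the top of pangram_test.py looks like. The real file has many more cases and the exact assertions here are illustrative, but the import line is the mechanism being described.]

```python
# pangram_test.py (abridged sketch)
import unittest

from pangram import is_pangram  # <- this line ties the tests to pangram.py


class PangramTest(unittest.TestCase):
    def test_empty_sentence(self):
        # Illustrative case: an empty string cannot be a pangram.
        self.assertIs(is_pangram(""), False)


if __name__ == "__main__":
    unittest.main()
```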

Grociu commented 5 years ago

What happens when you run pytest pangram_test.py? Are the tests executed? Do you get a report from the tests that ran?

kenlyle2 commented 5 years ago

is_pangram() is an empty function that only contains the word "pass". Are we supposed to do our implementation in the _test file? I am still not seeing the connection. I have done a Robot Framework project that worked out pretty well, but no Python testing, so try to remember what being noobish was like.

How does running tests on the sentences in pangram_test.py relate to my implementation file, mypangram.py?

It seems like a really simple question... the command given, like pytest pangram_test, seems to test some sentences and assert whether those should return True or False, but how does that relate to my implementation?

kenlyle2 commented 5 years ago

> What happens when you run pytest pangram_test.py? Are the tests executed? Do you get a report from the tests that ran?

10 failed in 0.29 seconds

Grociu commented 5 years ago

When you download the exercise you get three files: the README.md, which is self-explanatory; the test file, which is a Python script that tests your exercise implementation; and an exercise stub. That stub is the pangram.py file that had the pass instruction. You are supposed to replace the pass with your implementation in the exercise stub.

As described by @jonmcalder, the test file references the exercise stub with from pangram import is_pangram. Since you implemented your solution in a file named mypangram.py, and that file is not referenced in the test file, your function is never imported into the test file and therefore is never tested. (I'm assuming the exercise stub still exists, as you're not getting an error when running pangram_test.py.)

I recommend you try two things. First, change pangram_test.py to import from mypangram instead. You'll then be testing your implementation, and at the same time get some practical knowledge of what I described. Then, second, undo the changes to pangram_test.py, implement your solution in the exercise stub pangram.py, and run the tests; you'll have completed the exercise the intended way.

TLDR: It's the file names.
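[A sketch of the stub as downloaded, with contents assumed from the description above:]

```python
# pangram.py -- the exercise stub as downloaded. The bare `pass` keeps
# the file importable, so the tests can run (and fail) before you have
# written anything.
def is_pangram(sentence):
    pass
```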

kenlyle2 commented 5 years ago

Ah, perfect! I knew there was a simple answer. Thanks for bearing with me. I guess I was just in a hurry to jump in, and missed that part about replacing the "pass" string. Thanks!

kenlyle2 commented 5 years ago

Update: 10 passed in 0.17 seconds!!!

kenlyle2 commented 5 years ago

I suggest adding something like the following above the current Exception Messages heading:

Your Mission:

Replace the word "pass" in pangram.py with the body of the function stub, which receives a sentence as its one argument. The function should return True or False, indicating whether the sentence passed in is a pangram.
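[By way of illustration, one possible body that completes that mission. A sketch, not the canonical solution:]

```python
# pangram.py -- one possible implementation: a sentence is a pangram
# when every ASCII letter appears in it at least once.
import string


def is_pangram(sentence):
    return set(string.ascii_lowercase) <= set(sentence.lower())
```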

yawpitch commented 5 years ago

Broadly speaking we’re using Test Driven Development principles — although we’ve written the tests for you — and so part of the Fun is reading the errors that crop up when the tests fail.

The pass statement is there to keep the empty “slug” file from throwing a SyntaxError before the tests can even run. Though there are many exercises, IIRC all have been implemented so you’ve got enough of a “slug” that the tests will run to completion but all tests will fail. This is actually a big improvement over the situation with many other language tracks, where no slug is provided at all.

From the Pangram slug you need only run the tests to encounter the first failure, which will be that False is not returned when “five boxing wizards [...]” is passed into your function. You could respond by putting return False at the bottom of your function, and you’ll pass that test but fail on “Five quacking Zephyrs[...]", and so on.
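[That first naive iteration, as a sketch:]

```python
# pangram.py -- the first naive step in the TDD loop described above:
# it passes the test expecting False, then fails the next one
# expecting True, prompting a real implementation.
def is_pangram(sentence):
    return False
```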

Via this iterative method you quickly learn the outline of what your implementation needs to do at a minimum. This is a helpful set of skills to acquire and I’m not sure that making the documentation more clear about the implementation (as opposed to about the problem) is really helping you with that.

It’s my hope that everyone starts to read the test suite and understand the constraints they’re trying to meet before they start implementing a solution, as that’s exactly what you’d do when using TDD principles in a work environment.

kenlyle2 commented 5 years ago

Thanks, all! This is all nominally helpful, but remember, this course seems aimed at newbies. For context, this pangram assignment is really proximate, logically, temporally, and on the page to the "Hello World!" assignment. Would it be kind maybe to at least say a few words about how the course has adopted the TDD philosophy? And minimally explain that the code has to "return" the results that match the tests? It's more a wall than an onramp at this point.


yawpitch commented 5 years ago

I’m sorry, but what exactly distinguishes pangram for you in terms of difficulty of the student experience from hello-world, which unlocks it?

I’m asking because I haven’t had the experience of starting out completely fresh on the Python track on the site. Can you explain precisely what was different for you in your approach to passing hello-world vs pangram?

The hello-world exercise also has a slug file, it also has a dummy function in place that at download time does nothing but pass, the tests also must import that function, and the student also needs to get that function to return the value required by the tests in order to pass.

It does not take an argument, which is one significant difference, but based on the discussion above if you’d renamed the slug to myhello_world.py you’d have seemingly run into the same problem you’ve described in the same order.

I’m not against providing some more instruction to the student, but I don’t yet see why pangram should be a point of focus. It’s unlocked by hello-world, but so are 12 other exercises, 11 Side and 1 Core. The two-fer exercise is the next Core one, and it has the same properties as pangram, so I’d have expected you to run into these difficulties there initially.

All this said, is there anything in the copy on the Running the Tests page that could have given you more clarity? The "implementation file you created" wording does seem potentially problematic, since most (if not all) of the exercises have the slug file already in place for the user.

kenlyle2 commented 5 years ago

Hey All! The fundamental issue is that I don't believe that the idea of modifying the stub file is ever explicitly stated. Maybe I missed it. Maybe it's "obvious", in which case we can simply close this thread.

Quoting my last post: "Would it be kind maybe to at least say a few words about how the course has adopted the TDD philosophy? And minimally explain that the code has to "return" the results that match the tests?"

I am sure that it's difficult to decide what to put on the exercise pages, and what should be links, but here are some proposed improvements to the layout.

=====

Exercise Name

This exercise, consistent with Exercism's philosophy, is structured using TDD (hyperlink). Understanding TDD is an important part of this course.

Existing Exercise Description

Assignment - Review the file ending in _test to understand what the tests are looking for. Modify the existing stub file, which is named very similarly to the exercise and typically contains the word "pass". Frequently, the stub file will contain a function, which receives one or more arguments from the tests.

In order for the tests to pass, your code has to return the values that the test is looking for. There is no Artificial Intelligence in the tests, so if they are looking for True or False, and you return Yes and No, the tests will fail.

You may need to import (link) additional libraries (link, like pypi) to complete the assignment.

Running the tests - The test file will typically import the file containing your code, which is why you don't need to reference your implementation file when calling the test file.

{Existing Running the Tests text}

=====

The current layout, jumping from the problem statement to Raising Exceptions seems a little jarring and out of order/context. Maybe Raising Exceptions would be better as a link later on the page.

Currently, it seems like a beginner recipe that says "Start with a roux"... and I think that my suggestions add clarity, improve the flow, and clarify the process.

Does that help?

cmccandless commented 5 years ago

> Can you explain precisely what was different for you in your approach to passing hello-world vs pangram?

I do believe this question is the crux of it. @kenlyle2?

To address some of your other confusion: there are links to a few guides to learning Python that should have been presented to any track newcomer; however, I am having difficulty finding those links on the site now as a track veteran. The documents can also be found on this page.

The intent of presenting these links is for newcomers to a language to have the base knowledge of syntax and workflow in a language needed to begin the track. If, after reviewing those tutorials/guides, a student is still unable to proceed, it is possible that our documentation needs some revision. It is also possible that those guides are insufficient, and additional links should be added (and any that are redundant, removed).

yawpitch commented 5 years ago

I can see the value of some additional explanation to make the initial exercises easier for a beginner to attack, as lord knows I wouldn't want to get "start with a roux" right off the bat.

That said, I'm just not sure where that should be except in the README.md attached to hello-world or the first -- and perhaps only -- exercise it unlocks.

I do not think that peppering every exercise with a "Beginner's Guide" is a good idea... in theory anyone who has successfully passed the tests for hello-world (or at the most two-fer, the current 1st "real" Core exercise) should have garnered enough information for everything that @kenlyle2 has made explicit to be redundant. These things should be patently obvious -- and they are, for anyone with a minimum understanding of imports and testing -- but that might take us giving them a gentler introduction in a focussed manner. Adding the blurb described above to multiple exercises is the unfocussed version of that.

So, @cmccandless, perhaps we're doing students a disservice by auto-approving hello-world, or at least by unlocking anything but two-fer for anyone on the "Mentored" track? Since hello-world is approved automatically we may be giving truly beginner students too much to onboard.

To back this notion up -- very roughly -- I've just browsed through a few dozen randomly chosen "Community Solutions" to hello-world ... I'd say roughly 1-2 in 5 would fail the tests, most of those because they print instead of return, but at least a good third of the failures appear to be completely syntactically invalid and couldn't possibly be imported and run. Which means students are merely submitting something whether it works or not and getting auto-punted to an impossible task. For that subset of students the next exercise, Core or Side, would be like scaling El Capitan.
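[To make the print-instead-of-return failure mode concrete, a sketch; the zero-argument hello() signature is taken from the discussion earlier in this thread:]

```python
# A common failing hello-world pattern: print() writes to stdout, but
# the function itself returns None, which is all the test can observe.
def hello_with_print():
    print("Hello, World!")


# What the tests actually check is the return value.
def hello():
    return "Hello, World!"


assert hello() == "Hello, World!"  # passes
assert hello_with_print() is None  # printing leaves nothing to assert on
```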

I'm not sure which would be better:

  1. Letting hello-world only unlock two-fer, which means that the student can't progress any further while they wait for the queue to churn, but at least they won't try other exercises that would be even more frustrating than two-fer and will -- admittedly eventually -- receive the benefit of a Mentor's help.
  2. Stop automatically approving hello-world and get an auto-mentor out right quick that can respond to these common misconceptions. Now completing the exercise can unlock the track as normal, but at least you've gone through some sort of pinch point. With well-crafted messaging we could "lead" the sort of student who can't solve this one forward?

I'd tend to lean towards the latter, but it does break the semantics of the other tracks ... a student starting in Haskell or C or Rust will also be auto-approved for a failing hello-world, it's just that being such "hard" languages the students are self-selecting themselves out of attempting those tracks until they've got the basics down. Hence they don't get the first-time programmers who have never even heard of the notion behind import and return.

The former would not break the site semantics, but would put additional pressure on the mentoring queue and frustrate students who want side exercises to attempt while they wait for a mentor.

kenlyle2 commented 5 years ago

Hey Michael!

You are right on point, and your research bears out my point that it's not clear... Hello World is often a "print" exercise, and you guys are (correctly, intelligently, and technically) asking for a function that returns something... presumably Hello World. Personally, I was one of those who submitted a print.

So, yeah, it would probably be best to teach the lesson (how to use tests and, more broadly, perhaps, the theory of TDD), rather than auto-approving Hello World, correct.

I agree that it shouldn't be necessary to give the same tutorials, or perhaps even linked resources, on every assignment. Possibly, the current format would suffice after the first handful of assignments, but even then it would be helpful to state that ...write a function that returns the values expected by the tests...

Please take a look at the detailed pseudo layout I suggested earlier - please consider it a PR :).

And good luck with your cooking!

cmccandless commented 5 years ago

@yawpitch Thank you for pointing out that hello-world is auto-approved; I had forgotten that. So if I understand @kenlyle2 's issue, pangram is the first exercise he attempted that was not auto-approval; that's the difference we were looking for.

Looking at your suggestions above, I think option 1 might be best. Mentors are already getting some of the users that would be stumped by this sort of confusion, as some portion of students who get past hello-world click on the next core exercise instead of the first side exercise. We've been operating under the assumption that two-fer catches a great deal of beginners; it might benefit us to make sure it catches more of the beginners in need of further explanation.

yawpitch commented 5 years ago

@kenlyle2 I have looked at the detailed snippet; my issue is that "write a function that returns the values" is heavy-handed. If you've passed hello-world and encountered two-fer you should already have this knowledge (a function can only be tested if it has some effect that can be observed by the test).

The theory of TDD, very simply, is "keep running the tests until they all pass; if they don't, improve your code and try again" ... this too should be known before you've gotten an approval on whatever exercise we consider the "first" one. My informal research also bears out that ~60-80% of the students have this figured out by the time they are approved for hello-world, and 100% of them have it figured out by the time they're approved for two-fer.

There's a fine line between giving students a boost and stating the obvious, and I'm afraid that to anyone who is not a completely first-time programmer all of the above is obvious. Most of the people using Exercism as a whole are not completely new to programming. Hence it's better to front-load this information into the first 1-2 Core exercises rather than have it "infect" a bunch of the Side ones.

@cmccandless Yeah, the auto-approval is the causative factor ... if we had a mechanism for running that through an auto-mentor that could delay approval if the submitted solution were egregiously wrong, that would be ideal, but since we're just now working on automated mentoring that's cart before the horse.

So yeah I'd suggest that maybe hello-world unlock only two-fer and maybe one or two more curated Side exercises that we could treat with kid gloves and expanded "beginner" Instructions. My only worry there, absent data, is that students who have no side exercises to work on while they await two-fer approval may simply give up out of boredom.

Sigh. Maybe we just need to find a mentor we can assign to just keeping two-fer empty for the moment? If it unlocks all the rest it becomes an obvious gatekeeper, but we'd have to keep the queue short for that to work well.

cmccandless commented 5 years ago

> Maybe we just need to find a mentor we can assign to just keeping two-fer empty for the moment? If it unlocks all the rest it becomes an obvious gatekeeper, but we'd have to keep the queue short for that to work well.

At least we have the initial auto-mentor on that exercise deployed.

cmccandless commented 5 years ago

> My only worry there, absent data, is that students who have no side exercises to work on while they await two-fer approval may simply give up out of boredom.

That is a concern... however, if we allow side exercises to be done by students that may not understand the exercism/pytest workflow, these side exercise submissions may go a month or more without being properly assisted.

At the time of this writing, there are 47 pending solutions for two-fer. Of the 5 oldest and 5 newest solutions currently pending, no pending solution is older than 3 days for this exercise.

Of the two side exercises currently unlocked by hello-world, 1 (pangram) has submissions that are 100+ days old; the other (robot-name) is probably too complex to be unlocked at this stage anyway.

yawpitch commented 5 years ago

> No pending solution is older than 3 days for this exercise.

That implies to me that some mentors have decided to ignore the priority queue altogether. Which would explain the backlog in “tedious-to-tutor” exercises like twelve-days.

Based on that I’d say yeah let’s have hello-world unlock only two-fer and put the energy into making two-fer a better onboarding exercise with more explicit Instruction/README copy.

I’ll try and take a whack at better language for two-fer tonight.