racket / htdp

Other
93 stars 70 forks source link

Correctly count succeeded tests. #213

Closed mikesperber closed 11 months ago

mikesperber commented 11 months ago

Fixes #212.

The test-engine previously assumed that, if there were no errors, the number of succeeded tests would be equal to the total number of tests.

This is of course wrong when the tests didn't run. As this can happen now after the previous round of fixes in test execution, properly count succeeded tests.

To that end, document that we're expecting test thunks to return booleans.

rfindler commented 11 months ago

Are there backwards compatibility concerns with changing this API?

mikesperber commented 11 months ago

If somebody ever called the constructor directly, yes, in principle. However, even the test-engine itself calls empty-test-object, which works as before.

mikesperber commented 11 months ago

(I'd really like to be able to test this properly, but have no clue how to do it.)

mfelleisen commented 11 months ago

Let/s merge and see what we encounter. (I guess we all should use bsl/isl some more and switch between menu langs and #langs.)

mikesperber commented 11 months ago

CI says "raco test" failed. On my machine, two tests fail, but they also failed before the change.

shhyou commented 11 months ago

I think it's the stepper tests that are failing:

aco test: "htdp-test/tests/stepper/automatic-tests.rkt"
raco test: @(test-responsible 'clements)
test-sequence: steps do not match
   given: (Error-Result "add-test!: expects a boolean, given '(#f)\n  in: the range of\n      the 1st argument of\n      (-> (-> boolean?) any)\n  contract from: \n      <pkgs>/htdp-lib/test-engine/test-engine.rkt\n  blaming: <pkgs>/htdp-lib/test-engine/syntax.rkt\n   (assuming the contract is correct)\n  at: <pkgs>/htdp-lib/test-engine/test-engine.rkt:18:11")
expected: '(before-after (9 false (check-expect (hilite (+ 1 1)) 2)) (9 false (check-expect (hilite 2) 2)))
test-sequence: steps do not match
   given: (Error-Result "add-test!: expects a boolean, given '(#f)\n  in: the range of\n      the 1st argument of\n      (-> (-> boolean?) any)\n  contract from: \n      <pkgs>/htdp-lib/test-engine/test-engine.rkt\n  blaming: <pkgs>/htdp-lib/test-engine/syntax.rkt\n   (assuming the contract is correct)\n  at: <pkgs>/htdp-lib/test-engine/test-engine.rkt:18:11")
...

and more. But it may or may not be a test engine issue (at least the contract points to test-engine/syntax.rkt).

rfindler commented 11 months ago

Also, I think we need some test cases to make sure we don't bring back a bug like this one later on. I can help with that, but some specifics on what the test cases should be would help me.

mfelleisen commented 11 months ago

We have test cases in test-engine, and the one I noted is equivalent to the one that Daniel submitted. I am not sure why it still fails.

rfindler commented 11 months ago

I might be missing something but I think the tests we have all passed before the change in this pull request was merged. So I'm saying that we want some test case that captures one example of how this change is an improvement.

mfelleisen commented 11 months ago

No. The one I quoted did not pass. It's a manual test -- i want to view it like a student. If @shhyou is right, it passes now and that's good.

rfindler commented 11 months ago

It is good you have manual tests and you run them from time to time, but that's not what I meant when I wrote "test case". I am not counting manual stuff. You seem to be arguing the point that we don't need to add an automatic test case for the change in this PR because you have a manual one. Is that what you mean?

mikesperber commented 11 months ago

@rfindler I would like a way to run a program from the teaching languages, including the top-level processing in lang/private/teach-module-begin, and then look at the resulting state from the test-engine.

mikesperber commented 11 months ago

@rfindler I would like a way to run a program from the teaching languages, including the top-level processing in lang/private/teach-module-begin, and then look at the resulting state in the test-engine.

mikesperber commented 11 months ago

As to @shhyou's failing test - for some reason, the check-expect-created thunk returns not #f as it should but '(#f) instead. This happens only in the test suite, not in actual operation.

@jbclements, could you take a look?

I think this is an old issue. I had to work around a contract issue in the exact same place previously:

https://github.com/racket/htdp/blob/5370d9898b8112b4205b783192126c16d8f5c47d/htdp-lib/test-engine/syntax.rkt#L72

(I should note that what changed was the contract of add-test! was tightened - the previous one had any instead of boolean?.)

rfindler commented 11 months ago

@rfindler I would like a way to run a program from the teaching languages, including the top-level processing in lang/private/teach-module-begin, and then look at the resulting state in the test-engine.

That file looks like it is defining #%module-begin in the teaching languages so it would be run in programs like

#lang htdp/bsl
(+ 1 2)

Is that what you mean?

shhyou commented 11 months ago

FWIW DrRacket has quite some test engine tests in BSL/ISL and they are currently failing after this PR's changes.

jbclements commented 11 months ago

I see this message, I should have some time (15-30 minutes?) to look at it tomorrow.

John

On Dec 16, 2023, at 01:26, Mike Sperber @.***> wrote:

As to @shhyou https://github.com/shhyou's failing test - for some reason, the check-expect-created thunk returns not #f as it should but '(#f) instead. This happens only in the test suite, not in actual operation.

@jbclements https://github.com/jbclements, could you take a look?

I think this is an old issue. I had to work around a contract issue in the exact same place previously:

https://github.com/racket/htdp/blob/5370d9898b8112b4205b783192126c16d8f5c47d/htdp-lib/test-engine/syntax.rkt#L72

— Reply to this email directly, view it on GitHub https://github.com/racket/htdp/pull/213#issuecomment-1858772421, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABXKOOC57IQTHVVSXYRCYLYJVSLTAVCNFSM6AAAAABAQBEQD2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNJYG43TENBSGE. You are receiving this because you were mentioned.

rfindler commented 11 months ago

The specific bug that @shhyou linked to is this error. First, put this into the definitions window and set the language to BSL via the menus:

(check-expect 1 1)

then run the program. You'll (correctly) see the output:

Welcome to DrRacket, version 8.11.1.7 [cs].
Language: Beginning Student; memory limit: 256 MB.
The test passed!

Then, in the REPL do this:

> (check-expect 1 1)
All 3 tests passed!

and it should have said that two tests passed, not three.

mikesperber commented 11 months ago

That file looks like it is defining #%module-begin in the teaching languages so it would be run in programs like

#lang htdp/bsl
(+ 1 2)

Is that what you mean?

Yes.

mikesperber commented 11 months ago

The specific bug that @shhyou linked to is this error. First, put this into the definitions window and set the language to BSL via the menus:

(check-expect 1 1)

then run the program. You'll (correctly) see the output:

Welcome to DrRacket, version 8.11.1.7 [cs].
Language: Beginning Student; memory limit: 256 MB.
The test passed!

Then, in the REPL do this:

> (check-expect 1 1)
All 3 tests passed!

and it should have said that two tests passed, not three.

@rfindler Just pushed a fix for this.

rfindler commented 11 months ago

That file looks like it is defining #%module-begin in the teaching languages so it would be run in programs like

#lang htdp/bsl
(+ 1 2)

Is that what you mean?

Yes.

There are some tests that are along these lines in the htdp repo but if something really needs to be run in the DrRacket context, we can add more tests to existing test suites there.

rfindler commented 11 months ago

[ ... ] and it should have said that two tests passed, not three.

@rfindler Just pushed a fix for this.

Thanks!

rfindler commented 11 months ago

The next test failure is this program (in the language menu-based BSL):

(check-expect 1 1)
(check-expect 2 2)
(+ "hello" "world")
(check-expect 3 3)

The output, as a string, is supposed to pass this predicate. The output is:

0 tests passed.+: expects a number, given "hello"

which does not pass the predicate. Is the predicate wrong? Should I change the 2 to a 0? It does seem a little strange that we are printing something about tests in this program, but I'm not going to complain too much!

mfelleisen commented 11 months ago

@rfindler — I think a student deserves feedbacks about tests even if the program fails before any tests are run.

(What seems weird, is that the first two tests aren’t run. But we decided a long time ago that the program expressions run before tests run.)

rfindler commented 11 months ago

You are saying the current behavior is correct?

Robby

On Wed, Dec 20, 2023 at 4:14 PM Matthias Felleisen @.***> wrote:

@rfindler — I think a student deserves feedbacks about tests even if the program fails before any tests are run.

(What seems weird, is that the first two tests aren’t run. But we decided a long time ago that the program expressions run before tests run.)

— Reply to this email directly, view it on GitHub https://github.com/racket/htdp/pull/213#issuecomment-1865214071, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADBNMB7DRYP4OJIJZY6VLLYKNPJ3AVCNFSM6AAAAABAQBEQD2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNRVGIYTIMBXGE . You are receiving this because you were mentioned.Message ID: @.***>

mfelleisen commented 11 months ago

You and I agreed with Mike about this behavior in the conversation about the precursor of this bug.