racket / htdp

Other
93 stars 70 forks source link

test coverage false negative in #lang htdp/bsl #200

Open bremner opened 1 year ago

bremner commented 1 year ago

In the following example pred is shown as uncovered (tested in DrRacket 8.8 and 8.10).

#lang htdp/bsl

(define-struct S (pred))

(check-expect (S-pred (make-S #t)) #t)

If I manually select the language (and delete #lang), DrRacket indicates full coverage. Interestingly whatever emacs racket-mode is doing, it also indicates full coverage for the #lang version

mfelleisen commented 1 year ago

Sadly this is a problem with the general coverage checker, which doesn't specialize to the specified #lang. I don't think the *SL maintainers can help here.

rfindler commented 1 year ago

I think the macros that make up the definition of bsl are a good place to start looking. That is, the way the test coverage works is to expand the program and record the source location of each subexpression. When one is evaluated, that source location is marked as covered. Then, when the program is done, all unmarked source locations are lit up.

In this example, it looks like maybe what's going on is the source location for the S is only on things that are behind lambdas. That is, the code below, expands a slightly simpler program and reports where it finds things with struct name's source location. They all seem to be behind lambda, except one, which might be at compile time, possibly?

[ EDIT: the code used to be searching for the s in (define-struct s (x)) but it should have been searching for x. That's now fixed ]

#lang racket

(define p
  (open-input-string "#lang htdp/bsl\n(define-struct s (x))"))
(port-count-lines! p)
(define s-pos 34)

(define stx
  (parameterize ([current-namespace (make-base-namespace)])
    (expand
     (parameterize ([read-accept-reader #t])
       (read-syntax
        "tmp.rkt"
        p)))))

(let loop ([stx stx]
           [inside '()])
  (cond
    [(syntax? stx)
     (when (equal? (syntax-position stx) s-pos)
       (printf "found ~s inside:\n" stx)
       (for ([i (in-list inside)])
         (printf "  ~s\n" (syntax->datum i)))
       (printf "\n"))
     (define l (syntax->list stx))
     (cond
       [(pair? l)
        (for ([x (in-list l)])
          (loop x (cons (car l) inside)))]
       [else
        (loop (syntax-e stx) inside)])]
    [(pair? stx) (loop (car stx) inside) (loop (cdr stx) inside)]))
rfindler commented 1 year ago

My thinking is that, assuming my guess above is correct, it should be straightforward for someone to change the expansion of define-struct so that it produces a #'(void) whose source location has been changed to be the same as the name of the struct and hopefully that'll fix the problem.

samth commented 1 year ago

There is in fact code that explicitly does this, with a note about how it's for test case coverage.

See https://github.com/racket/htdp/blob/master/htdp-lib/lang/private/teach.rkt#L1062-L1067

samth commented 1 year ago

For reasons that I don't understand, moving the (void)s around fixes the problem. #201 does that.

samth commented 1 year ago

I do think this indicates a problem with the coverage tool, though. I haven't managed to reproduce it with a simpler program, unfortunately.

rfindler commented 1 year ago

So maybe we should figure out to make a unit test for test coverage?

Robby

On Mon, Sep 18, 2023 at 4:13 PM Sam Tobin-Hochstadt < @.***> wrote:

For reasons that I don't understand, moving the (void)s around fixes the problem. #201 https://github.com/racket/htdp/pull/201 does that.

— Reply to this email directly, view it on GitHub https://github.com/racket/htdp/issues/200#issuecomment-1724437172, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADBNMES5RT3TBSNPDPOIKTX3C2QPANCNFSM6AAAAAA4436GB4 . You are receiving this because you commented.Message ID: @.***>

mfelleisen commented 1 year ago

Argh. I added these lines based on a prompt from John way back and Mike put them back in place.

(Robby and I had a brief side-conversation. We can fix this issue here but it really is a problem of the coverage tool. Similar -- but not the same -- false negatives show up in #lang racket.)

mfelleisen commented 1 year ago

@rfindler Should we merge Sam's PR for now?

rfindler commented 1 year ago

@rfindler Should we merge Sam's PR for now?

I'm not sure. When I fixed my little script above, I do see a (void) with the right srcloc that should have been evaluated. I don't mind merging the PR, but it might make sense to play around a little more to understand why it wasn't working.

rfindler commented 1 year ago

There is definitely something fishy going on here. When I change line 1066 from

(syntax/loc field (void))

to

(quasisyntax/loc field (printf "~s\n" '#,field)))

I do see the printf so I'm thinking that the fix probably does belong elsewhere.

rfindler commented 1 year ago

It looks like this worked in 7.9 (blog post on nov 2, 202) but not in 8.0 (blog post on feb 13, 2021).

I'm not sure the history is helpful, tho, since a lot of things changed. It looks like, between 7.9 and 8.0, there were some changes to signatures that led to this commit which is what introduced the (syntax/loc field (void))) as an effort to unbreak coverage. It seems reasonable to assume things were working at the moment of that commit, but I cannot verify it.

I don't see any commits in the errortrace repo from the time period in question that seem relevant.

rfindler commented 1 year ago

I've created a gist that distills down what DrRacket is doing when it invokes errortrace's test coverage and it reports the same wrong results as we see in the htdp program.

https://gist.github.com/rfindler/12eedd4aef298e84d56f0f70784c36f8

samth commented 1 year ago

What is DrRacket doing differently for the "Beginning Student" test coverage vs the "#lang htdp/bsl" test coverage?

samth commented 1 year ago

Here's one weird thing: if I reverse sorted on line 182: https://gist.github.com/rfindler/12eedd4aef298e84d56f0f70784c36f8#file-drr-test-coverage-rkt-L182 everything is covered (which is not what I expected sorting to do).

rfindler commented 1 year ago

What is DrRacket doing differently for the "Beginning Student" test coverage vs the "#lang htdp/bsl" test coverage?

It looks like the difference is in the syntax property 'errortrace:annotate. That property is not on the code that's generated for the coverage information for the struct selector name and the teaching languages tools just ignore that, while the #lang-based test coverage pays attention to it.

That is, changing this call to should-annotate? to just #t produces the desired results. This is the definition of should-annotate?.

I'm not sure why the property isn't being added; if this is the function that's adding the property, it sure seems like it should be showing up.

rfindler commented 1 year ago

Ah. This is probably the right fix:

$  git diff
diff --git a/htdp-lib/lang/private/teach.rkt b/htdp-lib/lang/private/teach.rkt
index c505bde1..4b15ebcc 100644
--- a/htdp-lib/lang/private/teach.rkt
+++ b/htdp-lib/lang/private/teach.rkt
@@ -1063,7 +1063,7 @@
                                   ;; one with the location of one field.  This marks the fields in
                                   ;; define-struct as covered.
                                   #,@(map (lambda (field)
-                                            (syntax/loc field (void)))
+                                            (datum->syntax #'here '(void) field field))
                                           (syntax->list #'(field_ ...)))
                                   proc-name ...)))))
                   ;; --- IN ---
rfindler commented 1 year ago

Here's one weird thing: if I reverse sorted on line 182: https://gist.github.com/rfindler/12eedd4aef298e84d56f0f70784c36f8#file-drr-test-coverage-rkt-L182 everything is covered (which is not what I expected sorting to do).

I think that's expected. Sometimes a region is marked as covered because it is, say, a function definition. That'll give us a large region. Then, when the function isn't called, there will be smaller regions, inside that region, that will be marked as uncovered. We want those smaller regions get priority, which is what the sorting is doing here.

samth commented 1 year ago

I still think moving the extra voids out of that location is the right thing to do, combined with your use of datum->syntax to copy location/properties.

rfindler commented 1 year ago

@samth do you want to do the honors? It sounds like no objects to that fix.

samth commented 1 year ago

I've pushed that revised change to #201, then I'll look into the test failure there.