brownplt / pyret-lang

The Pyret language.

Generated values in check blocks #1633

Open dbp opened 2 years ago

dbp commented 2 years ago

I'm trying to write down an example of randomized testing (e.g., fuzzing or QuickCheck-style tests). Writing the tests works fine, but when they fail, I haven't figured out a good way of seeing what data they failed on.

For example, if I write it as follows:

check:
  input = generate-input()
  my-fun(input) does-not-raise
end

(for a fuzzer)

and it finds a bug, I get a backtrace, but the backtrace doesn't tell me (as far as I can tell) what the actual values passed were.

I'm not sure of a good solution, though maybe someone has an idea?

I tried adding a spy: input end statement, but the output from spy isn't scoped to the test blocks: all the input values are shown at the top of the interactions, followed by the test results, which is not very usable. And since I don't think there is a way for me to catch errors other than with check assertions, I don't think I can build a check combinator that selectively spies. That would work better, though with many failures it still wouldn't be great. For tests that don't error, selectively spying would improve things: eta-expand the predicate in satisfies and spy in the false branch.
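A minimal sketch of that last idea, for the non-erroring case only. The names spying and is-small are hypothetical, made up for illustration, and the fixed list stands in for generated inputs:

```pyret
# Hypothetical combinator: wraps a predicate so that the input is
# spied only when the predicate returns false.
fun spying(pred):
  lam(inp) block:
    result = pred(inp)
    when not(result):
      spy: inp end
    end
    result
  end
end

# Hypothetical predicate, standing in for the property under test.
fun is-small(n): n < 10 end

check:
  # A fixed list stands in for generated inputs.
  for each(inp from [list: 1, 5, 20]):
    inp satisfies spying(is-small)
  end
end
```

This does not help when the function under test raises, and the spy output is still not scoped to the check block, so it only cuts the noise down to the failing inputs.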

jpolitz commented 2 years ago

Good question. We should enhance the satisfies output to show the LHS when the RHS errors. Then this pattern would work:

fun my-fun(x): x + 1 end

check:
  fun helper(input) block:
    my-fun(input)
    true
  end
  inputs = [list: 1, "a"]
  for each(input from inputs):
    input satisfies helper
  end
end

Right now it just reports the RHS error, but the value is available for reporting. (To see the value-based output, compare with a helper that doesn't call my-fun at all and just ends with false.)

dbp commented 2 years ago

I started implementing this, but realized while doing it that the (perhaps intentionally undocumented?) run-task that the test runners use can be used to do what I want:

include either
... 
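# Builds a predicate that is true when f(inp) completes without an error.
fun error-free(f):
  lam(inp):
    # right(_) is the case where f raised; left(_) means it returned normally
    cases(Either) run-task(lam(): f(inp) end):
      | right(_) => false
      | left(_) => true
    end
  end
end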

check:
  inp = generate-inp()
  inp satisfies error-free(my-fun)
end

This produces a nice error, as it shows what the left side value is for which the predicate failed.

(It does seem like this pattern should be supported, without using what I guess are internal libraries).

dbp commented 2 years ago

Oh, I spoke too soon -- while that gives the input, it doesn't give the backtrace, since it eats the exception. So it's not a perfect solution, but it is at least better, as I can then rerun the failing input myself in the interactions window.

jpolitz commented 2 years ago

In your use case are you interested in things that would be raised with raise, or in things that are genuine errors, like field-not-found?

dbp commented 2 years ago

Probably both? The use case is fuzzing, so any error is interesting… (as it indicates a bug).

On Tue, Dec 7, 2021, at 6:34 PM, Joe Politz wrote:

In your use case are you interested in things that would be raised with raise, or in things that are genuine errors, like field-not-found?