NorfairKing / sydtest

A modern testing framework for Haskell with good defaults and advanced testing features.

Hedgehog report progress #77

Open 1Jajen1 opened 5 months ago

1Jajen1 commented 5 months ago

I switched a few hedgehog property tests from tasty to sydtest. Those are for CESU-8 and Modified UTF-8 validation. When changing validation, encoding, or decoding, I run the property tests with a lot more examples, sometimes for an hour or two in the background.

Hedgehog and tasty print out progress updates, but sydtest is completely silent, making it harder to gauge how much longer the tests will run.

It's not too bad with --fail-fast but frequent progress reports would be nice to have, especially because hedgehog supports them directly.

1Jajen1 commented 5 months ago

Also, is --fail-fast supposed to end the test on failure? It reports the failed tests, but doesn't kill the other running ones, or so it seems.

NorfairKing commented 5 months ago

@1Jajen1 Please show me the code you're working on. sydtest doesn't output by default (as much as that's possible) because tests are run in parallel and outputting is not threadsafe. I haven't used Hedgehog much because I recommend against it, so I'm not sure what the current situation is. Make sure you try with the --debug if you "just" want to see progress.

I'd be happy with fixes/improvements of course.

1Jajen1 commented 5 months ago

Please show me the code you're working on.

Sure: https://github.com/1Jajen1/Brokkr/blob/main/brokkr-cesu8/test/ModifiedUtf8Spec.hs. The old tasty code is in the brokkr-nbt package in an older commit.

sydtest doesn't output by default (as much as that's possible) because tests are run in parallel and outputting is not threadsafe.

Tasty can output to some extent in a parallel setting: it seems to print top down and update progress for the top test currently under evaluation. Hedgehog also has a way to report progress for parallel tests that goes beyond the limited thing tasty has: it shows full updates for all properties being tested without spamming the console. It wraps concurrent-output to achieve this.

For reference, this is what hedgehog outputs and updates during the test (checkParallel inside a sydtest spec, so things are a bit nested): [screenshot: Screenshot_20240516_223154]

I haven't used Hedgehog much because I recommend against it

Why's that? I find hedgehog much nicer to use than QuickCheck. Its error reporting in particular is amazing.

Make sure you try with the --debug if you "just" want to see progress.

It's a bit verbose (especially because I am doing a few hundred thousand iterations), but it definitely helps.

NorfairKing commented 5 months ago

@1Jajen1 It's pretty important to me that sydtest doesn't delete any of the output it's already sent. Those kinds of fancy things tend to leave the terminal in a state where you need to reset it. I'd be open to improvements to the way sydtest produces output though!
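One way to get progress without ever rewriting earlier output is to append whole lines while holding a lock. The following is a minimal sketch using only base, under stated assumptions: it is not how sydtest produces output, and `progressLine`, `reportLine` and `runWorker` are hypothetical names, with a plain loop standing in for a long-running property.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, newMVar, putMVar, takeMVar, withMVar)
import Control.Monad (forM, forM_, when)
import System.IO (hFlush, stdout)

-- | Hypothetical formatter for one progress line.
progressLine :: String -> Int -> Int -> String
progressLine name done total = name ++ ": " ++ show done ++ "/" ++ show total

-- | Print one complete line while holding the lock, so lines coming
-- from different worker threads never interleave. Output is append-only.
reportLine :: MVar () -> String -> IO ()
reportLine lock msg = withMVar lock $ \_ -> putStrLn msg >> hFlush stdout

-- | Stand-in for a long-running property (an assumption for this sketch):
-- it reports once every 'step' iterations.
runWorker :: MVar () -> String -> Int -> Int -> IO ()
runWorker lock name total step =
  forM_ [1 .. total] $ \i ->
    when (i `mod` step == 0) $
      reportLine lock (progressLine name i total)

main :: IO ()
main = do
  lock <- newMVar ()
  dones <- forM ["propA", "propB"] $ \name -> do
    done <- newEmptyMVar
    _ <- forkIO (runWorker lock name 100000 25000 >> putMVar done ())
    pure done
  -- Wait for every worker before exiting.
  mapM_ takeMVar dones
```

Because nothing is erased or moved, the terminal is never left in a state that needs a reset; the trade-off is that long runs produce more lines than an in-place display would.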

I haven't used Hedgehog much because I recommend against it

Being required to write all generators yourself throws the baby out with the bathwater. You can generate generators, see genvalidity. I'd be happy to chat if your company would like an intro to validity-based testing. There are also talks online and all my products use this approach.

1Jajen1 commented 5 months ago

It's pretty important to me that sydtest doesn't delete any of the output it's already sent. Those kinds of fancy things tend to leave the terminal in a state where you need to reset it. I'd be open to improvements to the way sydtest produces output though!

I'd definitely keep this optional and off by default. This is only useful in some specific circumstances or when you just want pretty output. Visual feedback for progress is nice, maybe not necessary, but nice. I also don't know how big of a change this would be. I'd assume you'd assign an output region per runner and output through that. It could then also show the currently running tests.
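The "one output region per runner" idea could look roughly like the sketch below, written directly against the concurrent-output package (the library hedgehog is said to wrap above). This is an illustration, not sydtest code: `statusLine` and `runner` are hypothetical names, and the `threadDelay` stands in for real test work.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, forM_)
import System.Console.Regions
  ( RegionLayout (Linear)
  , displayConsoleRegions
  , finishConsoleRegion
  , setConsoleRegion
  , withConsoleRegion
  )

-- | Hypothetical formatter for the text shown in a runner's region.
statusLine :: String -> Int -> Int -> String
statusLine name done total = name ++ ": " ++ show done ++ "/" ++ show total

-- | Each runner owns one region and rewrites only that region.
runner :: String -> Int -> IO ()
runner name total =
  withConsoleRegion Linear $ \region -> do
    forM_ [1 .. total] $ \i -> do
      setConsoleRegion region (statusLine name i total)
      threadDelay 50000 -- stand-in for running one test
    -- Replace the live region with a final, permanent line.
    finishConsoleRegion region (name ++ ": done")

main :: IO ()
main = displayConsoleRegions $ do
  dones <- forM ["runner 1", "runner 2", "runner 3"] $ \name -> do
    done <- newEmptyMVar
    _ <- forkIO (runner name 20 >> putMVar done ())
    pure done
  mapM_ takeMVar dones
```

Note that this style does rewrite terminal lines in place, which is exactly the behaviour the maintainer wants to avoid by default, so it would only make sense behind an opt-in flag.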

Being required to write all generators yourself throws the baby out with the bathwater. You can generate generators, see genvalidity.

I mean technically nothing stops you from putting this system over hedgehog generators right? I don't mind the generators, but I'd agree that it's a weak point. Shrinking is also sometimes suboptimal, but that has different reasons. I haven't spent much time thinking about this, but my first thoughts are: hedgehog feels more complete, the output is amazing, and it has lots of useful features such as state machine testing.

I'd be happy to chat if your company would like an intro to validity-based testing. There are also talks online and all my products use this approach.

I wish this was an option (Haskell in the first place, but proper automatic testing too). Brokkr is hobby work. A game server for an existing client/game seemed like a fun challenge that touches a ton of different areas to learn about. So far it has spawned an actually fast hashtable library, bit-packed vectors, and lots of Minecraft-related bits. A lot of it has been made with a focus on performance, but in more recent work on it I tried to take testing more seriously, mostly because I realized I'm bad at it.

NorfairKing commented 5 months ago

I mean technically nothing stops you from putting this system over hedgehog generators right?

That's not right. The shrinkers would become unusably slow (I think).

1Jajen1 commented 5 months ago

That's not right. The shrinkers would become unusably slow (I think).

You can just throw away the autogenerated shrink tree and add your own. It's been a few years since I've looked at the internals, but hedgehog's generators and shrinking should be compatible one to one with QuickCheck's generators. You should be able to create hedgehog generators from Arbitrary instances without issues, but the other way around loses shrinking.
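Discarding the automatic shrink tree and attaching a hand-written shrinker can be sketched with `Gen.prune` and `Gen.shrink` from hedgehog. This is only an illustration of the mechanism, not a claim about shrinking performance: `shrinkTowardsZero` and `genInt` are hypothetical names, and the shrinker is a deliberately simple QuickCheck-style `a -> [a]` function.

```haskell
import           Hedgehog (Gen)
import qualified Hedgehog.Gen as Gen
import qualified Hedgehog.Range as Range

-- | Hypothetical hand-written shrinker in the QuickCheck style:
-- try zero first, then half the value (rounded towards zero).
shrinkTowardsZero :: Int -> [Int]
shrinkTowardsZero 0 = []
shrinkTowardsZero n = 0 : [h | let h = n `quot` 2, h /= 0]

-- | 'Gen.prune' throws away the generator's automatic shrink tree;
-- 'Gen.shrink' then adds shrinks produced by our own function.
genInt :: Gen Int
genInt =
  Gen.shrink shrinkTowardsZero $
    Gen.prune $
      Gen.int (Range.linear (-1000) 1000)

main :: IO ()
main = Gen.sample genInt >>= print
```

Pruning first matters because `Gen.shrink` on its own keeps the existing shrinks and only adds to them.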