Open jwoudenberg opened 7 years ago
This is a compelling argument to me.
@mgold thoughts?
Yeah, that sounds about right.
I'm not sure how much simplification this would actually enable without regressing the shrinking behavior of map
and mapN
.
Excellent! I'm working on an experiment in #162 (another potential avenue for simplification if it doesn't impact performance). I'll report back when I have some code to show.
Maybe this has been seen and considered already, but hedgehog is a haskell library that tried to address common difficulties with quickcheck generators: http://teh.id.au/posts/2017/04/23/property-testing-with-hedgehog/
Also, interesting comments from the QuickCheck author: https://www.reddit.com/r/haskell/comments/646k3d/ann_hedgehog_property_testing/dg1485c/
It seems like Hedgehog's main selling point is that it does what we already do 😄
In Hedgehog, a reasonably good shrink function is constructed for free alongside every generator. Shrink functions are coupled with every generator, and a lazy tree of all possible shrinks is constructed alongside each generated value. Shrinking is then little more than traversing the tree. This is Hedgehog's theoretical point of difference from QuickCheck, though it also differs in a few practical ways.
This part of Hedgehog is interesting:
While Range helps us construct generators in terms of bounds and steps, we may also wish to write generators in terms of negative invariants. It may be simpler to generate terms, apply some predicates to them, and discard the invalid ones. This gives us the flexibility to write much more complicated generators, though they may take longer to run, or diverge.
Generators that discard heavily are certain to be less productive than other constructive generators. They can generate an arbitrary number of invalid values for each valid one, at the expense of your CPU cycles and precious seconds you probably needed for your life. So, this is a tool to be used gently and infrequently.
Seems like they're essentially talking about Fuzz.filter
here.
Yes, that sounds correct.
Is the major point here that hedgehog is not monadic / does not support andThen
and therefore we have grounds to deprecate it?
I wonder what we can learn from its strategies for generating and shrinking, e.g. advise us on #168.
Would not a slightly simpler solution be to keep andThen
and simply make it not shrink? Seems like that would bring all the proposed benefits, but still keep the convenient API?
Shrinking is important. You can't get rid of shrinking in general, and it's hard to know what shrinks are good, in general. For example. Fuzz.frequency
should shrink by shrinking the chosen fuzzer, not picking a different one. But maybe someone else wants something else.
Yeah, I agree. I turned off shrinking once to address a slow-to-shrink failure back in the elm-check days and the output was so horrendous I ended up rewriting the test.
If we're shipping something that doesn't shrink, we're probably shipping a footgun that we'll later recommend against using. 😅
The one place we use Fuzz.andThen
in our tests is in a construct like this:
let
makeListOfBigRecords : Int -> Fuzzer (List BigRecord)
makeListOfBigRecords items =
0 ->
Fuzz.constant []
1 ->
Fuzz.map (\thing1 -> [ thing1 ]) (bigRecordFuzzer 0)
2 ->
Fuzz.map (\thing1 thing2 -> [ thing1, thing2 ]) (bigRecordFuzzer 0) (bigRecordFuzzer 1)
-- map3 and map4 respectively.
_ ->
Debug.crash "Too many things"
in
Fuzz.intRange 0 4
|> Fuzz.andThen makeListOfBigRecords
|> Fuzz.map Array.fromList
The tests processing those big records get really, really slow when the array is long. This only seemed to happen when Fuzz.list
generated really long lists. Our app logic only cares if the length is ≧ 3 (but can technically handle longer lists), so I solved the issue by feeding a length argument into a helper function.
In order to quit using Fuzz.andThen
while keeping our tests relatively fast, we need a way to specify a maximum size on lists. The argument to bigRecordFuzzer
can be handled with Fuzz.map
and List.indexedMap
.
That's a good usecase @nickmeharry, thanks for sharing. There's a discussion in #168 about list Fuzzers perhaps generating fewer elements. Also, it's possible to create a shortListFuzzer without relying on andThen
, although it's not as flexible in the maximum size of lists it generates as yours.
shortList : Fuzz.Fuzzer a -> Fuzz.Fuzzer (List a)
shortList itemFuzzer =
Fuzz.frequency
[ (1, Fuzz.constant [])
, (1, Fuzz.map (\a -> [ a ]) itemFuzzer)
, (1, Fuzz.map2 (\a b -> [ a, b ]) itemFuzzer itemFuzzer)
]
@BrianHicks showed me another use case for andThen
: creating a lazy fuzzer. See here. It might make sense adding a function like that directly to elm-test.
How would you all feel if I sent around a short survey asking for applications people have for andThen
, to take inventory of what it would take to make removing it acceptable for people?
Sounds good to me!
On Wed, Jun 14, 2017, 8:44 PM Jasper Woudenberg notifications@github.com wrote:
How would you all feel if I sent around a short survey asking for applications people have for andThen, to take inventory of what it would take to make removing it acceptable for people?
— You are receiving this because you commented.
Reply to this email directly, view it on GitHub https://github.com/elm-community/elm-test/issues/161#issuecomment-308620414, or mute the thread https://github.com/notifications/unsubscribe-auth/ABCxwIbotXnKpn6vBeOpyLPhhugZrIfIks5sEKi0gaJpZM4NkyHf .
My usecase for andThen
is for the "update" tests, when I want the Msg fuzzers not to be standalone but to depend on previous model (see branch fuzzers-dependent-on-model).
Model
andThen
apply it to Model -> Fuzzer Msg
function to get a Msg fuzzermap2
them together to get a new Model
The problem with this is the shrinking (when the fuzz tests pass, well they just pass. When they fail, the shrinking hogs the CPU and never ends).
What would be a proposed new API to do something similar? custom
? (ATM I don't see how that would work)
@Janiczek nice use case! Yes, the idea is that custom
provides the basic building block to implement tools for any specific use case. It takes a generator and a shrinker, of which the generator should be simple to implement because you can compose it in the same way you're currently composing your Fuzzer (generators support andThen
). Building a custom shrinker is harder and it sounds like that's exactly what you need, considering the default shrinker isn't working for you.
I made a survey! https://docs.google.com/forms/d/e/1FAIpQLSebuE0i4KXChLN4nstmH720UeQ3jgF-PRlCg-_eNj0Nv7i_Yg/viewform?usp=sf_link
I'll distribute it among Elm channels and report back the results.
@jwoudenberg for future reference, how long do you plan to let the survey run?
@rtfeldman I plan on closing it this weekend. I got four responses so far that didn't really unearth anything new. I'll write up a summary and a proposal for next steps later this weekend.
Sounds good!
On Sat, Jun 24, 2017, 11:27 AM Jasper Woudenberg notifications@github.com wrote:
@rtfeldman https://github.com/rtfeldman I plan on closing it this weekend. I got four responses so far that didn't really unearth anything new. I'll write up a summary and a proposal for next steps later this weekend.
— You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub https://github.com/elm-community/elm-test/issues/161#issuecomment-310856474, or mute the thread https://github.com/notifications/unsubscribe-auth/ABCxwENUgiNeUpegK_ptlaFGZI97vLyRks5sHVTzgaJpZM4NkyHf .
Here's a summary of all the use cases that came up in the survey, which got 4 responses:
lazy : (() -> Fuzzer a) -> Fuzzer a
to fuzz recursive data structures.Fuzz.frequency
.Some thoughts on how we could address this:
Evan had a nice idea for an API that could address this in a neat way. I tried it out here. @Janiczek and @BrianHicks, would something like that be a good substitute for fuzzing Msg's for your use cases?
lazy
to fuzz recursive data structures.It should be pretty easy to implement lazy
directly in elm-test. Unfortunately lazy
has the same disadvantage as andThen
when it comes to #160: Figuring out whether a fuzzer is valid or not before running it. Does anyone have any nice ideas for working around that?
We have a workaround using mapN for this. A nicer solution is being considered here. Would that work for you @nickmeharry?
Fuzz.frequency
This sounds like something that could be achieved with mapN
too. I've reached out to this person to get some details.
The randomWalk
idea does not allow for the following case: using the model to create and Html msg
, and then using the Html msg
to produce a list of possible msg
that can be produced by the current UI, and then choosing one of those to perform next. Though maybe that sort of test is not appropriate for fuzz testing, since there may not be a sensible, general way to shrink that.
@jwoudenberg Wow, the randomWalk
idea seems really nice! From a read-through, I wonder about hardcoding Msgs into these tests (ie. Input "tom"
in the docs). Does that limit what bugs the fuzz test can find? (I'll probably try to port my examples to use this lib and see what happens :) )
@avh4 You mean grab the messages that can be generated from the Html fully automatically? I didn't know that was possible, cool!
I like the ease of use of that scenario, it would take very little work to get a test up and running. I also have doubts about what the quality of those tests would be in the general case. Consider a rendered page with two buttons, once of which logs the user out and the other of which proceeds the user to a page with lots of complexity. 50 % of fuzz runs would end up testing the trivial scenario of logging out and the problem compounds with every extra step the test makes.
So for longer series of Msg's, I think some guidance is needed to ensure test time is spent well, but I like the idea as a single shot test: I'm on this page, I click any button, then X should hold for my model.
@Janiczek Credit goes to Evan!
You raise an interesting point with regards to the hardcoded messages. I don't really have a clear opinion on in yet. First things that come to mind: We could look incorporating randomness into the API for the messages, although that would probably complicate the API. On the other hand, perhaps it's nice to gently nudge away from that much detail in what's a pretty high-level test. For testing specific complicated transitions, it's always possible fuzz a single function. Or, if you want to, to first set up a model at a specific state (optionally using randomWalk
to get there), then fuzz a single message in as much detail was you want and append that to the end of the message list returned by randomWalk
using map2
. Anyway, food for thought :).
@jwoudenberg Yes, the frequency
based method should work just as well for me. In hindsight, my approach was basically emulating frequency
, but not as well.
@jwoudenberg care to finish the bullet in your comment? :)
@Janiczek Haha, that entire list was a previous draft of the paragraph above. I deleted the bunch. Sorry for the mess :).
I've thought a bit more about the lazy case and don't really see a way we can have our cake and eat it too. It seems to me we have to choose:
lazy
at the cost that we cannot statistically analyse fuzzers. This means it cannot in general be determined whether a fuzzer is invalid (because you never know what that lazy is going to do, currently it's possible to cause a Debug.crash
this way). Analysing a fuzzer to find problems like double-nested lists and doing something about them will also become much harder.lazy
at the cost of not being able to easily express recursive data structures. To do something recursive-like a user will need to pass down a decreasing counter to ensure the recursion stops at some point. @BrianHicks I'd love to hear if this would be workable for your usecase.I'm leaning towards option two.
Limiting recursion actually sounds preferable to me. lazy
is just a way to get around the defined-in-terms-of-itself limitation. I'm doing odd things with frequency
to make sure my fuzzers don't get too deep; they run very slow and crash if I don't. A first-class solution to recursive data structures would be helpful!
Seconding Brian. It's too easy to write a fuzzer that makes an infinite tree. You have to do things like
if depth > 2 then
( 0, Fuzz.constant (Point ( 1, 2, 0 )) )
else
( 1 / depth, Fuzz.map GeometryCollection (Fuzz.list (helper (depth + 1))) )
(source)
What about an API like this? You pass in a function next : (a -> a) -> a
to recurse down a level and a value done : a
to as an end value. I say 'value', but these's a
's are all fuzzers in our case so it done
will actually generate multiple actual values, just not recursive ones
With these two functions we can generate recursive values for users, with a maximum depth we as library authors decide. We can add some Fuzz.frequency
in the implementation to add some randomness to the maximum depth.
I'm wondering it taking a solution like that into elm-test itself directly is a good idea though, it might be nice to see it used for a bit first. Same goes for the message list Fuzzer. What do you all think of making those snippets available as a cookbook for the time being and leaving it at that until they've matured a bit?
So, we still don't have a good story for "what to do instead of Fuzz.andThen
".
Fuzz.custom
is great if you're building all your fuzzers from scratch, but if a library wants to use user Fuzzers, there is still some part of API missing.
From the user perspective, it might be as simple as ??? : Fuzzer a -> (Generator a, Shrinker a)
. That should be trivial to write from elm-test
perspective, and give library authors the freedom needed.
I think that would require holding references to the original generator and shrinker. Perhaps not impossible, but a pain.
I think a Fuzzer a -> (Generator a, Shrinker a)
function would be an excellent addition to the API, but impossible to implement using the current shrinker representation. The problem is that there's no way, I think, of turning a RoseTree a
back into a Shrinker a
.
Consider why internally we're using trees instead of lists. A shrinker has the signature a -> LazyList a
. If we'd try to use that type as the internal representation of our shrinking behaviour then we'd run into trouble as soon as we'd try to implement map. After all, map would need to look something like this:
map : (a -> b) -> (a -> LazyList a) -> (b -> LazyList b)
If I understand my types correctly there cannot exist an implementation for that. We can map the LazyList a
to a LazyList b
, but that would give us a a -> LazyList b
which is not a Shrinker b
. And even if it were it would not be possible to use a -> LazyList b
to actually shrink values because you would not be able to call it recursively to create smaller and smaller values.
It would be great if we could do something like this in the future. Perhaps at some point Elm will allow us to take this approach or we can try to reimplement fuzzers on the basis used by Hypothesis.
FWIW, it's possible if you add a (b -> a)
argument.
dimap : (a -> b) -> (b -> a) -> (a -> LazyList a) -> (b -> LazyList b)
as done in the shrinker library
convert : (a -> b) -> (b -> a) -> Shrinker a -> Shrinker b
So, referencing this, that would be:
type Fuzzer a b
= Custom (Generator b) (Shrinker b)
| Dimap (b -> a) (a -> b) (Fuzzer a a)
| AndThen (a -> Fuzzer a b) (Fuzzer a a)
custom : Generator a -> Shrinker a -> Fuzzer a a
custom =
Custom
dimap : (b -> a) -> (a -> b) -> Fuzzer a a -> Fuzzer a b
dimap =
Dimap
andThen : (a -> Fuzzer a b) -> Fuzzer a a -> Fuzzer a b
andThen =
AndThen
toGenerator : Fuzzer a b -> Generator b
toGenerator fuzzer =
case fuzzer of
Custom generator _ ->
generator
Dimap map2Fn mapFn prev ->
toGenerator prev
|> Random.map mapFn
AndThen andThenFn prev ->
toGenerator prev
|> Random.andThen (\val -> andThenFn val |> toGenerator)
shrunkenGenerator : Fuzzer a b -> Generator (LazyList a)
shrunkenGenerator x =
-- Non trivial implementation which would still be easier than what we have now.
Debug.crash "TODO"
(this typechecks... whether it makes sense is another thing :sweat_smile: )
Hi, I've seen that the removal of andThen is already done in a PR, but I'd like to add one use case.
I'm testing a linear algebra package, and doing so, I need to verify invariants on operations, for example trace ( A + B ) == trace A + trace B
. More invariants example listed in this issue.
To do so, I need to generate random matrices of a given compatible size for the operands. Currently I'm using andThen
like so:
sameSizePair : Fuzzer ( Matrix, Matrix )
sameSizePair =
Fuzz.tuple ( size, size )
|> Fuzz.andThen (\s -> Fuzz.tuple ( matrixSized s, matrixSized s ))
matrixSized : ( Int, Int ) -> Fuzzer Matrix
matrixSized ( height, width ) =
Fuzz.oneOf
[ rawMatrixSized ( height, width )
, transposedMatrixSized ( height, width )
, stridesMatrixSized ( height, width )
]
rawMatrixSized : ( Int, Int ) -> Fuzzer Matrix
rawMatrixSized ( height, width ) =
dataFuzzer ( height, width )
|> Fuzz.map (T.unsafeFromTypedArray 2 [ height, width ])
I'm not sure how I'd do this without the andThen
. I see there is a custom
fuzzer but going that road would complicate a lot of things for me since I'm really not familiar with shrinkers and all.
More code available here if needed.
andThen
is all about using randomness to drive more randomness. mapN
is about transforming randomness in a non-random way. It sounds like the dimensions of the matrix don't affect the variation you chose for the matrix. So what about:
type MatrixType = Raw | Transposed | Strides
matrixType : Fuzzer MatrixType
matrixType = oneOf [Raw, Transposed, Strides]
matrix : Fuzzer Matrix
matrix =
Fuzz.tuple4 (size, size, matrixType, matrixType)
|> Fuzz.map (\(width, height, type1, type2) -> ...)
You'll need to convert values of MatrixType
into functions that make matrices. If you can randomly pick from a list of those functions and avoid the union type, do that.
andThen is all about using randomness to drive more randomness
Yes that's exactly what I need. Or maybe I'm misundersanding something else. The matrices are to be generated of a random compatible size, and their content is then also randomly generated (dataFuzzer
in previous example, code below).
I cannot generate a random matrix only from it's dimensions and it's type, there needs to be a side effect somewhere. Well actually there is a way if I also generate a seed and use the Random
module, but it seems to circumvent alltogether the usefulness of fuzzers.
Let's imagine that List.map2
would only work if the two list are of the same size. How could you test the invariant for example that for any two lists of the same size, the result is of the same size. My problem is very much the same. Is it possible to check that without andThen
or creating a specific random generator of two lists of a same given size?
PS: I went the Random
way for generating the data of a matrix of a given size already as follows. I hoped I didn't have to do that from scratch for each type of fuzzer I need like sameSizePair
metionned initially.
dataFuzzer : ( Int, Int ) -> Fuzzer (JsTypedArray Float64 Float)
dataFuzzer ( height, width ) =
jsFloat64Array (height * width)
jsFloat64Array : Int -> Fuzzer (JsTypedArray Float64 Float)
jsFloat64Array arrayLength =
randomJsTypedArray JsFloat64Array.fromList (Random.float 0 1) arrayLength
randomJsTypedArray : (List b -> JsTypedArray a b) -> Random.Generator b -> Int -> Fuzzer (JsTypedArray a b)
randomJsTypedArray arrayFromList generator arrayLength =
Fuzz.map (generateRandomArray arrayFromList generator arrayLength) Fuzz.int
generateRandomArray : (List b -> JsTypedArray a b) -> Random.Generator b -> Int -> Int -> JsTypedArray a b
generateRandomArray arrayFromList generator arrayLength intSeed =
Random.initialSeed intSeed
|> Random.step (Random.list arrayLength generator)
|> Tuple.first
|> arrayFromList
@mgold I think the fact that I had the intermediate oneOf
in the example is misguiding sorry about that. You could consider the same example but only with rawMatrixSized
like below. dataFuzzer
code is in previous post. Basically it generates an array of random floats, of a given size.
sameSizePair : Fuzzer ( Matrix, Matrix )
sameSizePair =
Fuzz.tuple ( size, size )
|> Fuzz.andThen (\s -> Fuzz.tuple ( rawMatrixSized s, rawMatrixSized s ))
rawMatrixSized : ( Int, Int ) -> Fuzzer Matrix
rawMatrixSized ( height, width ) =
dataFuzzer ( height, width )
|> Fuzz.map (T.unsafeFromTypedArray 2 [ height, width ])
Hey @mpizenberg, can you share your experience using Fuzz.andThen
so far? Based on what your describing it does sound like Fuzz.andThen
fits your use case very well, but I'd also expect you to run into weird crashes and hanging tests every now and then, especially when test cases are failing. Have you seen any of that?
If so, then the reason for that might be that the amount of shrinking possibilities for a random matrix of numbers based on a random matrix with and height is immense, and so a failing test takes a long time to work through them all.
One approach that doesn't require andThen
that might work for you is to create two random matrices, then use map2
to take the matrices with potentially different sizes, and return two matrices of the same size. For instance, you could create a copy of the array with the smallest amount of elements and fill it with the elements of the larger array.
Hey @mpizenberg, can you share your experience using Fuzz.andThen so far?
@jwoudenberg Yes sure! Let's first say that this project is very much a WIP and I'm not spending a lot of time on this one, so take into account that it's not a priority to me. If I have to do another way, I will, it's just that it was quite convenient ;) Like Evan said, if you provide functionalities to users they will use it ^^.
So, while I'm developping, I only test with very low sizes any way. For now I have size = Fuzz.intRange 0 16
meaning I never get matrices with more than 16x16 elements. My steps are usually:
Debug.crash
holes inside.Debug.crash
Debug.crash
-> I usually get back multiple tests iterations that failed with a Given someValue ...
where someValue
is not always very usefull (this is probably due to hard shrinking I guess) but sufficient for me to understand if I did something obviously wrong.That's basically the project flow.
Up until now I have fortunately not experienced crashes or very slow tests. Probably because I keep size limit low (16) while experimenting.
If for the sake of the performances of the package, or because it can cause issues with shrinking, it's better to remove andThen
entirely, I'm fine with that. In my use case, since I have control over maximum size, and variability of matrices, I think I do not need shrinking that much, and would also be fine with and andThen
version that would not shrink.
One approach that doesn't require andThen that might work for you is to create two random matrices, then use map2 to take the matrices with potentially different sizes, and return two matrices of the same size.
Yes that was the first thing I was doing at the beginning except troncating the bigger matrix instead of completing the smaller one. I eventually stopped doing so for two reasons:
let ... in
code at the beginning of each test was defeating the purpose of having clear tests. This was more problematic for readibility.Point 2. was annoying because I was doing that for all tests that needed specific input. I did not thought of doing this for creation of a fuzzer though. It feels a bit hacky but you're right, it should work I believe.
I see your problem better now, thank you for explaining.
The knowledge that I was "loosing" time by generating more random values than necessary
I'd be more concerned with shrinking inefficiencies. Shrinking a number that is discarded by the test is useless.
A point made waaay up in this thread is that advanced users should use Fuzz.custom
. This means supplying your own Random.Generator a
, which looks a lot like building up fuzzers and supports andThen
without issue, and your own Shrinker a
, which is where you supply logic about how to make generated values "smaller". The idea is that, for complicated data structures, you will know better than a general algorithm what shrinks are likely to be valid or interesting. For example, shrinking a number will always tend towards zero, but shrinking a matrix should perhaps tend towards the identity.
However, it sounds like there's a more specific function that may do what you need: listOfLength (Fuzz.int 1 16 |> Fuzz.map (\x -> x*x*2)) matrixElementFuzzer
. This was proposed in #216. I suspect that it would be easier and more reliable to shrink a list of variable length than shrink an arbitrary andThen
function. I'm not thrilled with the idea, but it would work well for your use case.
A point made waaay up in this thread is that advanced users should use Fuzz.custom.
You're probably right on this one. I think the issue was that, since I could create the fuzzers I needed with andThen
I considered my use case as regular. I see now that it is a bit more advanced. In any case, shrinking should be driven by the data structure and domain specifics. And data structures like matrices are specific enough to require a dedicated shrinking strategy I guess.
I think, if there was not the andThen
function, and the documentation would say something like: "if you cannot build the fuzzer you need with provided functions, there is a high chance that your data is too specific, and automatic shrinking strategies will not work for you. In this case, we recommend you use Fuzz.custom
...", I would probably have accepted straight away to spend a little time to do it this way.
However, it sounds like there's a more specific function that may do what you need: listOfLength ...
A function like
listOfLength : Fuzzer Int -> Fuzzer a -> Fuzzer (List a)
would be limiting in my case, since I could not "factorize" a list length into any width and height. It would have to be a square, which is quite limiting for testing non square matrices.
Thanks alot for the feedback. I think I'm gonna take the custom
way and will probably get back to you at some time if I have an issue doing so.
Although I really like the
andThen
conceptually, in practice I run into trouble when using it more often than not (#105, ~#132~, #160). I think it's very easy to come at it with the expectation that it works the same way asandThen
for generators, but find that using it actually requires a fair bit of knowledge about the shrinking behaviour. This seems a bit treacherous for users of the library, most of whom might hardly be aware of shrinking or consider it an implementation detail.Most cases where I used
andThen
could be refactored not to rely on it. For those instances whereandThen
is really needed, I'd argue usingcustom
might be the better choice because it forces the user to think about the shrinking behaviour they want. It's very hard to guess the correct let alone efficient shrinking behaviour for the general case. Consider the fuzzer in the (artificial) example below doing way too much work:Lastly, a few possible feature and performance improvements of fuzzers in general currently seem to be quasi-blocked on the complexity of the fuzzer codebase (#109, #148).
andThen
seems to be by far the most complex part of theFuzzer
codebase at the moment. Removing it might allow us to simplify a bit, empowering us to tackle some of those ideas.