leanovate / gopter

GOlang Property TestER
MIT License
598 stars 40 forks source link

Recursive generator #50

Closed joneshf closed 5 years ago

joneshf commented 5 years ago

Thanks for this package, it's great!

I'm working on something and running into a bit of trouble with a recursive data type. How might you use this package for recursive data? For instance in the ScalaCheck documentation, there's an example of binary integer trees. How could you emulate that example with this package?

I'd assume you start like:

type tree interface {
    isTree()
}

type node struct {
    left  tree
    right tree
    v     int
}

func (*node) isTree() {}

type leaf struct{}

func (*leaf) isTree() {}

var genLeaf gopter.Gen = gen.Const(&leaf{})

var genNode gopter.Gen = gen.Struct(reflect.TypeOf(&node{}), map[string]gopter.Gen{
    "left":  genTree,
    "right": genTree,
    "v":     gen.Int(),
})

var genTree gopter.Gen = gen.OneGenOf(genLeaf, genNode)

But that ends up with an initialization loop. Converting genNode and genTree to functions allows it to be initialized:

func genNode() gopter.Gen {
    return gen.Struct(reflect.TypeOf(&node{}), map[string]gopter.Gen{
        "left":  genTree(),
        "right": genTree(),
        "v":     gen.Int(),
    })
}

func genTree() gopter.Gen { return gen.OneGenOf(genLeaf, genNode()) }

But, using either one ends up with a stack overflow.

Any thoughts on how to implement that binary int tree example?

untoldwind commented 5 years ago

Recursive generators have not been on the agenda yet, so there is no out of the box solution (yet?).

Of course the problem with a recursive structure is that the generator obviously has no concept of depth, i.e it does not know when to stop and hence possibly run into a stack overflow. This might be mitigated by using gen.Weighted instead of gen.OneOfGen (i.e. increase the probability of a Leaf). This does not address the core of the problem though.

One possible solution might be to do create a command-based test (https://godoc.org/github.com/leanovate/gopter/commands). I.e. instead of creating the tree out of the blue you generate a list of commands to fill it (like an RPN calculator).

Of course another solution would be to implement the generator directly. After all a generator is just a func(*GenParameters) *GenResult ... though that might be somewhat ugly.

Another solution that crosses my mind is to just add a depth parameter to your genNode and genTree functions. I.e.

func genNode(depth int) gopter.Gen {
    return gen.Struct(reflect.TypeOf(&node{}), map[string]gopter.Gen{
        "left":  genTree(depth - 1),
        "right": genTree(depth - 1),
        "v":     gen.Int(),
    })
}

func genTree(depth int) gopter.Gen { 
  if depth <= 0 {
    return gen.Leaf
  }
  return gen.OneGenOf(genLeaf, genNode(depth)) 
}

This feels kind of wrong and I have not tested it ... but it should work.

joneshf commented 5 years ago

Makes sense. Thanks for the ideas!

Checking the depth is the recommendation in the QuickCheck manual and sized exists. I've not used ScalaCheck, and I'm still pretty new to go, so I wanted to check if this had been already thought through.

Assuming this works out, would you be interested in a PR to add it to the package proper?

untoldwind commented 5 years ago

After some experiments I ended up with this example:

package main

import (
    "fmt"
    "reflect"

    "github.com/leanovate/gopter"
    "github.com/leanovate/gopter/gen"
    "github.com/leanovate/gopter/prop"
)

type tree interface {
    isTree()
    dump() string
}

type node struct {
    Left  tree
    Right tree
    V     int
}

func (*node) isTree() {}

func (n *node) dump() string {
    return fmt.Sprintf("V:%d (L:%s R:%s)", n.V, n.Left.dump(), n.Right.dump())
}

type leaf struct{}

func (*leaf) isTree() {}

func (*leaf) dump() string {
    return "Leaf"
}

var genLeaf gopter.Gen = gen.Const(&leaf{}).Map(func(t *leaf) tree { // cast to tree
    return t
})

func genNode(depth int) gopter.Gen {
    return gen.StructPtr(reflect.TypeOf(&node{}), map[string]gopter.Gen{
        "Left":  genTree(depth - 1),
        "Right": genTree(depth - 1),
        "V":     gen.Int(),
    }).Map(func(n *node) tree { // cast to tree
        return n
    })
}

func genTree(depth int) gopter.Gen {
    if depth <= 0 {
        return genLeaf
    }
    return gen.OneGenOf(genLeaf, genNode(depth))
}

func main() {
    properties := gopter.NewProperties(nil)

    properties.Property("Just dump", prop.ForAll(
        func(tree tree) bool {
            fmt.Println(tree.dump())
            return true
        },
        genTree(10),
    ))

    properties.Run(gopter.ConsoleReporter(false))
}

(unluckily the Map with explict casts to tree are necessary for reflection to work correctly. Not sure if its possible to "fix" that) This is most likely not the best solution, but seems to work.

untoldwind commented 5 years ago

As a matter of completeness: There actually is a way around the initial loop problem.

Like this:

package main

import (
    "fmt"
    "reflect"

    "github.com/leanovate/gopter"
    "github.com/leanovate/gopter/gen"
    "github.com/leanovate/gopter/prop"
)

type tree interface {
    isTree()
    dump() string
}

type node struct {
    Left  tree
    Right tree
    V     int
}

func (*node) isTree() {}

func (n *node) dump() string {
    return fmt.Sprintf("V:%d (L:%s R:%s)", n.V, n.Left.dump(), n.Right.dump())
}

type leaf struct{}

func (*leaf) isTree() {}

func (*leaf) dump() string {
    return "Leaf"
}

var genLeaf gopter.Gen = gen.Const(&leaf{}).Map(func(t *leaf) tree { // cast to tree
    return t
})

var genNodePtr gopter.Gen

func genNode(params *gopter.GenParameters) *gopter.GenResult {
    if genNodePtr == nil {
        return gopter.NewEmptyResult(reflect.TypeOf((*tree)(nil)).Elem())
    }
    return genNodePtr(params)
}

var genTree gopter.Gen = gen.OneGenOf(genLeaf, genNode)

func main() {
    genNodePtr = gen.Struct(reflect.TypeOf(node{}), map[string]gopter.Gen{
        "Left":  genTree,
        "Right": genTree,
        "V":     gen.Int(),
    }).Map(func(n node) tree { // cast to tree
        return &n
    })

    properties := gopter.NewProperties(nil)

    properties.Property("Just dump", prop.ForAll(
        func(tree tree) bool {
            fmt.Println(tree.dump())
            return true
        },
        genTree,
    ))

    properties.Run(gopter.ConsoleReporter(false))
}

Of course this is quite ugly and - as explained before - there is no control over the depth of the generated tree (i.e. one might end up with a stackoverflow depending on luck). Nevertheless it might be helpful to add the little trick with the genNode function to the API.

And of course pull-request / fresh ideas are always welcome

joneshf commented 5 years ago

Nice!

I think with your first example, if we have something like:

func sized(f func(int) gopter.Gen) gopter.Gen {
    return func(params *gopter.GenParameters) *gopter.GenResult {
        return f(params.Rng.Intn(100))(params)
    }
}

then I'd be all set. You could then say sized(genTree) instead of choosing the depth.

Does that make sense to do it that way? I tried with params.Rng.Int(), but it makes for some giant trees. I'm not sure about using MinSize or MaxSize.

unluckily the Map with explict casts to tree are necessary for reflection to work correctly. Not sure if its possible to "fix" that

This was a problem I was having for a long time and didn't know how to deal with. Thanks so much for showing that.

untoldwind commented 5 years ago

I would have said that this could be done with FlatMap but then realized that it is really cumbersome to use in its current form. This probably requires some experiment, but ideally one would just write gen.IntRange(20,100).FlatMap(genTree).

Until then though ... The idea behind MinSize, MaxSize is exactly this: To set limits to everything that has a size/length (like slices, strings, ...). So how about this:

func Sized(f func(int) gopter.Gen) gopter.Gen {
    return func(params *gopter.GenParameters) *gopter.GenResult {
        size := params.Rng.Intn(params.MaxSize-params.MinSize) + params.MinSize
        return f(size)(params)
    }
}

Edit: There is a branch for that now

joneshf commented 5 years ago

Awesome, thanks!

I run into an issue with that function though. params.MaxSize and params.MinSize both start at 0, so params.Rng.Intn(0) panics. Probably would need to bound it below by 1.

It would also be useful to document that if it's used recursively, this will create some huge data unless it's culled hard each time it's called (I didn't actually test my example from before :blush:). For instance, with a structure like tree above, it slows to a crawl for me when the MaxSize reaches 10 and consumes memory until I kill it. Maybe take your example above but divide by 10 instead of subtracting by 1?

untoldwind commented 5 years ago

Good find concerning the MaxSize = MinSize, just forgot about that. Should be fixed now.

And yes, with the defaults genTree might try to generate a tree of depth of 100, which is ... uh ... huge ;) I added a gentle reminder to the comment.

I'll probably add a better test and then merge it back to master.

untoldwind commented 5 years ago

Merge the change to master. Whille close this issue for now

Porges commented 3 years ago

Another workaround is to use a helper like:

func delayed(p *gopter.Gen) gopter.Gen {
    return func(gp *gopter.GenParameters) *gopter.GenResult {
        return (*p)(gp)
    }
}

Then you can break the cycle like:

type Tree struct {
    left  *Tree
    value int
    right *Tree
}

var genTree gopter.Gen

func init() {
    var nilTree *Tree
    genTree = gen.OneGenOf(
        gen.Const(nilTree),
        gopter.DeriveGen(
            func(left *Tree, value int, right *Tree) *Tree {
                return &Tree{left, value, right}
            },
            func(t *Tree) (*Tree, int, *Tree) {
                return t.left, t.value, t.right
            },
            delayed(&genTree),
            gen.Int(),
            delayed(&genTree)))
}
Porges commented 3 years ago

(Note that this doesn’t work with .Map as it immediately invokes the generator!)