stedolan / crowbar

Property fuzzing for OCaml
MIT License
183 stars 28 forks source link

try harder to avoid blowing the stack #16

Closed yomimono closed 6 years ago

yomimono commented 6 years ago

this alone helped for set/map testing but I think it's not enough for stuff as potentially big and self-tangled as the parsetree types. Intuitions about how better to set this particular parameter welcome :)

stedolan commented 6 years ago

facepalm

stedolan commented 6 years ago

I want to think about this problem a bit better in future, but this patch fixes the bug.

Getting a good distribution of tree depths in a quickcheck-like system is hard. The basic problem is that easy strategies (e.g. pick a leaf 25% of the time, a branch 75% of the time) actually don't terminate: even though the probability of a path being long goes down exponentially as the tree gets deeper, the number of such paths goes up exponentially, and if you're not careful the number of paths goes up faster than they're culled.

In crowbar, the basic interface is an n-way map operation rather than monadic bind, which gives an important advantage: we know the branching factor at each node in the tree, so in principle we should be able to make the probabilities go down fast enough to generate moderately deep, balanced-ish trees and yet terminate with probability 1. It's a bit tricky, though.

gasche commented 6 years ago

For the record, the way I deal with size in random-generator is to use fuel instead of size. On each tree node (even unary nodes), I decrease the fuel by one, and then I split it randomly among all children. (My implementation actually only support zero-ary, unary and binary nodes only.)

The binary split (after decreasing the fuel) is as follows:

let split_int n r =
  let k = Random.State.int r (n + 1) in
  (k, n - k)

My experience with random-generator, compared to height-based controls that I tried first, is that this fuel-repartition approach returns much nicer, human-looking trees/terms.