c-cube / qcheck

QuickCheck inspired property-based testing for OCaml.
https://c-cube.github.io/qcheck/
BSD 2-Clause "Simplified" License

Support stats with negative integers #40

Closed jmid closed 5 years ago

jmid commented 7 years ago

Consider the following interaction to observe the distribution of the builtin integer generators:

# let t = Test.make (add_stat ("dist", fun x -> x) small_nat) (fun _ -> true);;
val t : QCheck.Test.t = QCheck.Test.Test <abstr>
# QCheck_runner.run_tests ~verbose:true [t];;
generated error  fail pass / total  time(s) name
[✓]  100    0     0    100 /  100      0.0  anon_test_8                                             
stats dist:
  num: 100, avg: 332.98, stddev: 1220.61, median 9, min 0, max 8845
            0-442: #######################################################          89
          443-885: ###                                                               5
         886-1328: #                                                                 2
        1329-1771:                                                                   0
        1772-2214:                                                                   0
        2215-2657:                                                                   0
        2658-3100:                                                                   0
        3101-3543:                                                                   1
        3544-3986:                                                                   0
        3987-4429:                                                                   0
        4430-4872:                                                                   0
        4873-5315:                                                                   1
        5316-5758:                                                                   0
        5759-6201:                                                                   0
        6202-6644:                                                                   1
        6645-7087:                                                                   0
        7088-7530:                                                                   0
        7531-7973:                                                                   0
        7974-8416:                                                                   0
        8417-8859:                                                                   1
        8860-9302:                                                                   0
================================================================================
success (ran 1 tests)
- : int = 0

Now, it would be nice to similarly inspect a sample of small_signed_int (or other signed generators for that matter). However, if one attempts to do so, one gets:

# let t = Test.make (add_stat ("dist",fun x -> x) small_signed_int) (fun _ -> true);;
val t : QCheck.Test.t = QCheck.Test.Test <abstr>
# QCheck_runner.run_tests ~verbose:true [t];;                           
generated error  fail pass / total  time(s) name
[ ]    0    0     0      0 /  100      0.0  anon_test_10 (collecting)Exception: Assert_failure ("src/QCheck.ml", 1188, 9).

(where the line number naturally differs with the particular QCheck version), which indicates that statistics are limited to non-negative integers. In the above case, I would love to see a nice bell curve centered around 0 as output :-)

This will require some more program logic (non-negative stats only need to be split into "positive buckets", while mixed positive and negative stats should be split into "mixed buckets", ...).
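A minimal sketch of what that extra logic could look like, assuming buckets are indexed relative to the sample minimum instead of relative to 0. The names `bucket_index` and `histogram` are made up for illustration and are not QCheck's API; the sketch also assumes a non-empty sample whose range fits in an int.

```ocaml
(* Hypothetical helpers, not QCheck's actual implementation. *)

(* Index of the bucket holding [x], counted from the sample minimum [min].
   Since [x >= min], [x - min] is non-negative even when [x] is negative. *)
let bucket_index ~min ~width x = (x - min) / width

(* Count how many samples fall into each of [n] equal-width buckets.
   Assumes [samples] is non-empty and [max - min] does not overflow. *)
let histogram ~n samples =
  let min = List.fold_left Stdlib.min max_int samples in
  let max = List.fold_left Stdlib.max min_int samples in
  (* Ceiling division, so that [n] buckets always cover [min..max]. *)
  let width = Stdlib.max 1 ((max - min + n) / n) in
  let counts = Array.make n 0 in
  List.iter
    (fun x ->
      let i = bucket_index ~min ~width x in
      counts.(i) <- counts.(i) + 1)
    samples;
  (min, width, counts)
```

With this scheme the purely non-negative case needs no special treatment: it is simply the case where the minimum happens to be at least 0.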

c-cube commented 7 years ago

I have a patch that outputs the following layout. It doesn't solve the performance issue though…

Stat for test stats_neg:

stats dist:
  num: 5000, avg: -6.66, stddev: 1301.56, median 0, min -9852, max 9922
     -9852..-8864:                                                                  14
     -8863..-7875:                                                                   9
     -7874..-6886:                                                                  21
     -6885..-5897:                                                                   7
     -5896..-4908:                                                                   9
     -4907..-3919:                                                                  15
     -3918..-2930:                                                                  14
     -2929..-1941:                                                                  12
      -1940..-952:                                                                  41
         -951..37: #######################################################        3893
         38..1026: ############                                                    860
       1027..2015:                                                                  14
       2016..3004:                                                                   9
       3005..3993:                                                                   8
       3994..4982:                                                                  12
       4983..5971:                                                                  16
       5972..6960:                                                                  10
       6961..7949:                                                                   9
       7950..8938:                                                                  14
       8939..9927:                                                                  13
      9928..10916:                                                                   0

c-cube commented 7 years ago

@jmid can you test this a bit? then I'll close the issue :-)

jmid commented 7 years ago

Sorry for the delay. I already tested it a bit and it seems to work great! I then digressed into adjusting the weights of the small_signed_int generator because its output didn't quite have the bell-curve shape I was expecting. That I was able to do so is a testament to how useful it already is.

I noticed that the division into buckets varies on a per-sample basis, which makes it harder to compare the output of consecutive samples. In the situation above (varying the small_signed_int distribution), being able to compare them would be nice. However, I'm not sure what can be done about it (fixing the lowest/uppermost bucket to start/end at a power of 10 or 2?). One probably cannot assume that things are either (a) entirely positive or (b) symmetric around 0. A less drastic change (for sufficiently wide samples) could be to simply round buckets and bucket sizes to the nearest 10 (or 20, or 50, or 100, or ...). It just seems more aesthetic to us humans to avoid bucket divisions such as the following:

         -951..37: #######################################################        3893

c-cube commented 7 years ago

Indeed, that might be helpful. Maybe I can round the lower bound down to the next multiple of 10.
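A small sketch of that rounding, with a made-up helper name (`floor_to_multiple` is not part of QCheck). The one subtlety is that OCaml's `/` and `mod` truncate towards zero, so a negative minimum needs an explicit correction to be rounded down instead of up.

```ocaml
(* Hypothetical helper: round [x] down (towards negative infinity) to a
   multiple of [m], assuming [m > 0] and no overflow near [min_int]. *)
let floor_to_multiple m x =
  let r = x mod m in
  (* [r] has the sign of [x]; for a negative remainder, step one more
     multiple down instead of truncating towards zero. *)
  if r >= 0 then x - r else x - r - m
```

For the sample minimum -9852 from the earlier output, `floor_to_multiple 10 (-9852)` evaluates to -9860, whereas a plain `(x / 10) * 10` would give -9850 and leave the minimum outside the first bucket.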

c-cube commented 7 years ago

please try again ;-)

jmid commented 7 years ago

With the latest version I get:

# let t = Test.make ~count:1000 (add_stat ("dist",fun x -> x) small_signed_int) (fun _ -> true);;
val t : QCheck.Test.t = QCheck.Test.Test <abstr>
# QCheck_runner.run_tests ~verbose:true [t];;
random seed: 259588188
generated error fail pass / total     time test name
[✓] 1000    0    0 1000 / 1000     0.0s anon_test_1

+++ Stat ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Stat for test anon_test_1:

stats dist:
  num: 1000, avg: -39.57, stddev: 2176.10, median 0, min -9820, max 9927
     -9820..-8833:                                                                   6
     -8832..-7845:                                                                   7
     -7844..-6857:                                                                   4
     -6856..-5869:                                                                   5
     -5868..-4881:                                                                  10
     -4880..-3893: #                                                                14
     -3892..-2905: #                                                                19
     -2904..-1917: #                                                                13
      -1916..-929: ##                                                               27
         -928..59: #######################################################         591
         60..1047: ####################                                            215
       1048..2035: ##                                                               22
       2036..3023: #                                                                16
       3024..4011: #                                                                13
       4012..4999:                                                                  10
       5000..5987:                                                                   5
       5988..6975:                                                                   7
       6976..7963:                                                                   2
       7964..8951:                                                                   7
       8952..9939:                                                                   6
      9940..10927:                                                                   0
================================================================================
success (ran 1 tests)
- : int = 0

meaning the lowest bucket consistently starts at a multiple of 10. This is with the following adjusted definition of nat:

  let nat st =
    let p = RS.float st 1. in
    if p < 0.25 then RS.int st 10
    else if p < 0.5 then RS.int st 100
    else if p < 0.8 then RS.int st 1_000
    else if p < 0.9 then RS.int st 5_000
    else RS.int st 10_000

where I have been playing with frequencies.
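For anyone wanting to reproduce this kind of experiment outside QCheck, here is a self-contained sketch. It assumes the signed generator is obtained from `nat` by picking a random sign, which is an assumption about QCheck's internals and not something confirmed in this thread; `signed` and `sample` are made-up names.

```ocaml
module RS = Random.State

(* The adjusted [nat] from above, repeated so the sketch stands alone. *)
let nat st =
  let p = RS.float st 1. in
  if p < 0.25 then RS.int st 10
  else if p < 0.5 then RS.int st 100
  else if p < 0.8 then RS.int st 1_000
  else if p < 0.9 then RS.int st 5_000
  else RS.int st 10_000

(* Assumption: a signed generator that flips a fair coin for the sign. *)
let signed st = if RS.bool st then nat st else - (nat st)

(* Hypothetical helper: draw [n] values from [gen] using a fixed seed,
   so that two weightings can be compared on repeatable samples. *)
let sample ~seed n gen =
  let st = RS.make [| seed |] in
  List.init n (fun _ -> gen st)
```

Calling `sample ~seed:123 1000 signed` twice yields the same list, which makes it easier to see the effect of changing a single threshold in `nat`.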

c-cube commented 7 years ago

Oh damn, I only shifted the lowest bucket without thinking about making the width of each bucket a multiple. I'm not sure if this is worth the complication… But suggestions about what to do are welcome!
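One possible shape for this, as a hedged sketch only: round the lower bound down and the bucket width up to the same step. All names here (`floor_to_multiple`, `ceil_to_multiple`, `nice_buckets`) are hypothetical, and the code assumes `step > 0`, `min <= max`, and a range that fits in an int.

```ocaml
(* Hypothetical helpers, not QCheck's code. *)

(* Round [x] down to a multiple of [m]; handles negative [x]. *)
let floor_to_multiple m x =
  let r = x mod m in
  if r >= 0 then x - r else x - r - m

(* Round a non-negative [x] up to a multiple of [m]. *)
let ceil_to_multiple m x = ((x + m - 1) / m) * m

(* [n] buckets as inclusive (low, high) pairs, where the first bound and
   the width are both multiples of [step]. *)
let nice_buckets ~n ~step ~min ~max =
  let lo = floor_to_multiple step min in
  (* Smallest width that covers [lo..max] with [n] buckets, then rounded up. *)
  let width = ceil_to_multiple step ((max - lo + n) / n) in
  List.init n (fun i -> (lo + i * width, lo + (i + 1) * width - 1))
```

The cost is that the last bucket may extend well past the sample maximum, and the bounds still change whenever the minimum or maximum crosses a multiple of the step, so this only partly addresses the comparability concern.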

jmid commented 7 years ago

Yes, there are probably no easy solutions to this one :-|

c-cube commented 7 years ago

We'll solve this later, I think — it's not critical.

jmid commented 5 years ago

Consider the statistics output arising from the following:

open QCheck

let mk_stat_test label g =
  Test.make ~name:label ~count:10_000 (add_stat (label,fun i -> i) g) (fun _ -> true);;

let tsmall_nat = mk_stat_test "small nat" (make Gen.small_nat);;
let tnat       = mk_stat_test "nat" (make Gen.nat);;
let tbig_nat   = mk_stat_test "big nat" (make Gen.big_nat);;
let tint       = mk_stat_test "int" int;;
let tint_range = mk_stat_test "int_range" (int_range (-13) 25333);;
let tint_range' = mk_stat_test "int_range'" (int_range 0 1000);;
let tint_range'' = mk_stat_test "int_range''" (int_range 0 19);;
let tint_range''' = mk_stat_test "int_range'''" (int_range (-10) 10);;

QCheck_runner.set_seed 123;;

QCheck_runner.run_tests
  [tsmall_nat;
   tnat;
   tbig_nat;
   tint;
   tint_range;
   tint_range';
   tint_range'';
   tint_range''';
  ];;

e.g., for tint:

+++ Stat ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Stat for test int:

stats int:
  num: 10000, avg: -6657218965481711.00, stddev: 2669577452265829888.00, median -40502661326582784, min -4610308102123514237, max 4611245582821469975
  -4610308102123514237..-4149230417876265022: ###################################################             509
  -4149230417876265021..-3688152733629015806: ###################################################             509
  -3688152733629015805..-3227075049381766590: ##################################################              498
  -3227075049381766589..-2765997365134517374: ################################################                475
  -2765997365134517373..-2304919680887268158: ###################################################             510
  -2304919680887268157..-1843841996640018942: ##################################################              503
  -1843841996640018941..-1382764312392769726: ###################################################             509
  -1382764312392769725..-921686628145520510: ################################################                484
  -921686628145520509..-460608943898271294: ####################################################            518
  -460608943898271293..468740348977922: #####################################################           529
  468740348977923..461546424596227138: ###################################################             509
  461546424596227139..922624108843476354: ##################################################              504
  922624108843476355..1383701793090725570: #############################################                   449
  1383701793090725571..1844779477337974786: ####################################################            516
  1844779477337974787..2305857161585224002: ###################################################             509
  2305857161585224003..2766934845832473218: #############################################                   453
  2766934845832473219..3228012530079722434: ##################################################              496
  3228012530079722435..3689090214326971650: ###############################################                 474
  3689090214326971651..4150167898574220866: ##################################################              502
  4150167898574220867..4611245582821470082: #######################################################         544
  4611245582821470083..-4151048769786056510:                                                                   0

There are a few issues here:

This commit https://github.com/jmid/qcheck/commit/82b62f89043763537284aa3232e21b40a3badc2b?diff=unified#diff-efb5bad18994b9ea565d883939945aea addresses these three.

Here's a corresponding output with the patch:

stats int:
  num: 10000, avg: -13407412135073796.00, stddev: 2664493770929318912.00, median 3299561771496807, min -4610808991690809906, max 4611554652002281884
  -4610808991690809906..-4149690809506155315: #######################################################         529
  -4149690809506155314..-3688572627321500723: ###################################################             491
  -3688572627321500722..-3227454445136846131: ####################################################            501
  -3227454445136846130..-2766336262952191539: ####################################################            505
  -2766336262952191538..-2305218080767536947: #################################################               476
  -2305218080767536946..-1844099898582882355: #####################################################           515
  -1844099898582882354..-1382981716398227763: ######################################################          520
  -1382981716398227762.. -921863534213573171: ###################################################             491
   -921863534213573170.. -460745352028918579: ##################################################              482
   -460745352028918578..     372830155736013: ##################################################              486
       372830155736014..  461491012340390605: ###################################################             495
    461491012340390606..  922609194525045197: ####################################################            504
    922609194525045198.. 1383727376709699789: #####################################################           513
   1383727376709699790.. 1844845558894354381: ####################################################            505
   1844845558894354382.. 2305963741079008973: ####################################################            501
   2305963741079008974.. 2767081923263663565: #####################################################           516
   2767081923263663566.. 3228200105448318157: ##################################################              484
   3228200105448318158.. 3689318287632972749: ##################################################              486
   3689318287632972750.. 4150436469817627341: ###################################################             500
   4150436469817627342.. 4611554652002281933: ###################################################             500

c-cube commented 5 years ago

@jmid would you consider opening a PR for that?

jmid commented 5 years ago

Hm. I see the commit is already on master, e.g., here: https://github.com/c-cube/qcheck/blob/b14291a9289a85ac657bcc08ee6c572294fcdaf7/src/core/QCheck.ml#L1568 (probably a consequence of me having committed both to my own fork - sorry about that), so I'm not sure a pull request makes sense anymore -- but then again, I'm a git novice, so I may very well be wrong... :-D