proper-testing / proper

PropEr: a QuickCheck-inspired property-based testing tool for Erlang
http://proper-testing.github.io
GNU General Public License v3.0
879 stars 168 forks source link

Too small numbers produced by range(min, max) generator #217

Closed SylwBar closed 4 years ago

SylwBar commented 4 years ago

There is a define that differentiates how integer(min, max) generator works: It is defined in proper_internal.hrl -define(SMALL_RANGE_THRESHOLD, 16#FFFF).

I created a single program in Elixir/PropCheck that expose this: (This is not problem of PropCheck, but it was easier for me to create example program.)

defmodule PropTest do
  use PropCheck

  def test(numtests, max_int, max_size \\ 42) do
    quickcheck(
      forall {size, int} <- sized(s, {s, integer(0, max_int)}) do
        IO.puts("size=#{size}, int=#{int}")
        true
      end,
      [{:numtests, numtests}, {:max_size, max_size}]
    )
  end
end

For ranges below the threshold generated numbers are not affected by size parameter:

iex()> PropTest.test 11, 65000
size=1, int=46928
.size=5, int=5987
.size=9, int=9559
.size=13, int=26090
.size=17, int=37666
.size=21, int=11847
.size=25, int=52080
.size=29, int=14958
.size=33, int=51134
.size=37, int=12208
.size=41, int=14521

However, for ranges greater than ?SMALL_RANGE_THRESHOLD, this function should work in other way. According to proper_arith.erl: %% When the range is large, skew the distribution to be more like that of an %% unbounded random integer.

The problem is that numbers generated this way are very low compared to range. Here is example that shows that even for sizes close to max_size = 42 (default).

iex()> PropTest.test 11, 66000     
size=1, int=0
.size=5, int=3
.size=9, int=1
.size=13, int=36
.size=17, int=31
.size=21, int=19
.size=25, int=69
.size=29, int=43
.size=33, int=11
.size=37, int=20
.size=41, int=14

If we tweak max_size to bigger value:

iex()> PropTest.test 11, 66000, 66000  
size=1, int=0
.size=6600, int=8452
.size=13199, int=44382
.size=19798, int=45498
.size=26397, int=3889
.size=32996, int=52179
.size=39595, int=39206
.size=46194, int=6253
.size=52793, int=13610
.size=59392, int=19333
.size=65991, int=37919

We will get bigger numbers, but this is not workaround, because max_size should be different for each range. For the same max_size bigger range will generate too small numbers:

iex()> PropTest.test 11, 10000000, 66000
size=1, int=0
.size=6600, int=5513
.size=13199, int=6105
.size=19798, int=20774
.size=26397, int=21579
.size=32996, int=99976
.size=39595, int=33700
.size=46194, int=48458
.size=52793, int=51828
.size=59392, int=33111
.size=65991, int=71285

I don't know how to generate bigger numbers using integer(min, max) generator. I guess this is bug in code - max_size parameter should be also taken into account when large integer numbers are generated.

kostis commented 4 years ago

I created a single program in Elixir/PropCheck that expose this: (This is not problem of PropCheck, but it was easier for me to create example program.)

It's also easier for me to reply in Greek rather than English, but you probably would not like that, right? PropEr is a Property-Based Testing tool for Erlang.

Either supply an Erlang program that shows the problem you are experiencing (so that we can easily reproduce and debug this) or report this in the PropCheck repository instead.

SylwBar commented 4 years ago

Hi. Sorry for including Elixir, this problem is not related to PropCheck. I not so familiar with Proper on Erlang but I quickly created simple property using rebar3 proper plugin.

I simplified the program - now maximum range() parameter is put directly: Here is result for 65000:

-module(prop_int_gen_test).
-include_lib("proper/include/proper.hrl").

prop_test() ->
    ?FORALL({Size, Num}, ?SIZED(S, {S, range(0, 65000)}),
        begin
            io:format("size=~p, int=~p~n", [Size, Num]),
            true
        end).

rebar3 proper result:

...
...
.size=36, int=52965
.size=37, int=50011
.size=37, int=56778
.size=38, int=4044
.size=38, int=52919
.size=39, int=47281
.size=39, int=14879
.size=40, int=1297
.size=40, int=58915
.size=41, int=26404
.size=41, int=25704
.size=42, int=13219
.size=42, int=25637
.
OK: Passed 100 test(s).

This works fine. Generated numbers are not tied to internal size parameter.

Here is result for 66000:

===> Verifying dependencies...
===> Compiling pbt
===> Testing prop_int_gen_test:prop_test()
size=1, int=0
.size=1, int=0
.size=1, int=1
.size=2, int=0
.size=2, int=3
.size=2, int=0
.size=3, int=0
.size=3, int=0
.size=3, int=1
.size=4, int=0
.size=4, int=4
.size=4, int=0
.size=5, int=0
.size=5, int=11
.size=5, int=1
.size=6, int=0
.size=6, int=16
.size=6, int=2
.size=7, int=3
.size=7, int=3
.size=7, int=4
.size=8, int=5
.size=8, int=5
.size=8, int=8
.size=9, int=0
.size=9, int=33
.size=9, int=36
.size=10, int=5
.size=10, int=5
.size=10, int=4
.size=11, int=9
.size=11, int=3
.size=11, int=18
.size=12, int=3
.size=12, int=32
.size=12, int=3
.size=13, int=0
.size=13, int=2
.size=13, int=2
.size=14, int=13
.size=14, int=27
.size=14, int=1
.size=15, int=2
.size=15, int=6
.size=15, int=11
.size=16, int=113
.size=16, int=29
.size=16, int=54
.size=17, int=11
.size=17, int=10
.size=18, int=19
.size=18, int=8
.size=19, int=28
.size=19, int=9
.size=20, int=0
.size=20, int=106
.size=21, int=7
.size=21, int=11
.size=22, int=10
.size=22, int=0
.size=23, int=18
.size=23, int=13
.size=24, int=16
.size=24, int=4
.size=25, int=17
.size=25, int=3
.size=26, int=31
.size=26, int=9
.size=27, int=23
.size=27, int=17
.size=28, int=6
.size=28, int=33
.size=29, int=14
.size=29, int=1
.size=30, int=22
.size=30, int=50
.size=31, int=11
.size=31, int=1
.size=32, int=16
.size=32, int=21
.size=33, int=40
.size=33, int=27
.size=34, int=51
.size=34, int=27
.size=35, int=21
.size=35, int=22
.size=36, int=18
.size=36, int=46
.size=37, int=340
.size=37, int=24
.size=38, int=11
.size=38, int=108
.size=39, int=6
.size=39, int=31
.size=40, int=5
.size=40, int=85
.size=41, int=60
.size=41, int=29
.size=42, int=30
.size=42, int=66
.
OK: Passed 100 test(s).

When ?SMALL_RANGE_THRESHOLD is crossed generated numbers are very low. They are scaled by "size" parameter but I guess for size equal to max_size generated numbers should be comparable to max parameter for range.

kostis commented 4 years ago

I think it's related to this commit (https://github.com/proper-testing/proper/commit/3c2ce0ea6664848a24c85284301118d75ac1d580). I will investigate, but help from the original committer (@manopapad) is appreciated here.

manopapad commented 4 years ago

As far as I remember we wanted to follow a different sampling strategy depending on the integer range the user chose: We assumed that if you were using a small range like 1..20, then the full range of values was meaningful, so the range should be sampled uniformly. Conversely, if you were using a range like 1..2M we assumed you probably wanted the same behavior as non_neg_int, where values closer to the lower limit are more likely to be returned, and increasing the size parameter (as testing continues) broadens the sampling. Therefore this is expected behavior. What is your issue with this behavior?

SylwBar commented 4 years ago

I understand how it was supposed to work. My tests show that even at the end of a series of tests, when the parameter "size" is large, the values generated are small.

manopapad commented 4 years ago

I see. We chose to have the growing behavior of integer(0,1M) match that of non_neg_int (which generally skews low), and the upper bound is only used as a filter. This is just what we (arbitrarily) chose to do for this generator, and there are certainly other ways to do it, none of which is more "right" than the others.

If you need a bounded generator that skews higher, then you could try something like ?SIZED(Size, resize(Size * 10, range(0,1M))) for an integer that grows 10x faster, or you could write a custom generator, or you could set ?SMALL_RANGE_THRESHOLD to a value more appropriate for your needs.

SylwBar commented 4 years ago

I use rounded float/2 generator output which is uniformly random and it works like a charm.

This artificial "size" boosting required for reaching upper bounds is not acceptable for me - in fact it is not clear what value I should put instead of 10 to reach upper bound.

Overall, experience with integer/2 generator was very confusing, especially because integer/2 generator documentation is misleading: "All integers between Low and High, bounds included"