Gen.(ui32,ui64) unsigned?

jmid commented 4 years ago

The current documentation for Gen.(ui32,ui64) http://c-cube.github.io/qcheck/0.13/qcheck-core/QCheck/Gen/index.html#val-ui32 says: "Generates (unsigned) int32/int64 values."

However their output does not look unsigned to me:

utop # Gen.generate ~n:10 Gen.ui32;;
- : int32 list =
[-603729975l; -1392289758l; -541684379l; -671722106l; 734220326l; -1534963290l;
 -1965668011l; -1794636676l; -1742997582l; 310327455l]
utop # Gen.generate ~n:10 Gen.ui64;;
- : int64 list =
[7619769240142337748L; -8825492657156686074L; -9195090798551082499L;
 6170233654447026723L; 155076194192071921L; -9131533943366183247L;
 2864414624886596057L; 6981268389106635498L; -6146802381508222341L;
 3496805997001729452L]

Do you agree? If so, we should revise the documentation (and perhaps the names).

c-cube commented 4 years ago

If you have time, could you instead:

rename current implementation to i32/i64
map Int32.abs (resp. Int64.abs) to them to define ui32/ui64?

jmid commented 4 years ago

I'll be happy to help with this (once I clear off a few other things at my end) and your name suggestions make good sense. I'm reluctant to go with abs though: it is a folklore example of not satisfying the natural property fun i -> abs i >= 0 that everyone expects:

# Int32.abs (Int32.min_int);;
- : int32 = -2147483648l

(2-complement representation, the gift that keeps on giving...) 😉

c-cube commented 4 years ago

heh, fair enough, although I guess min_int can be dealt with separately. No hurry for addressing this.

sir4ur0n commented 3 years ago

FYI in https://github.com/c-cube/qcheck/pull/109 I create int32 and int64 but keep ui32 and ui64 as deprecated aliases (backward compatibility for now) and fix the documentation.

This doesn't solve the problem of providing unsigned (positive) such generators for now, but I will quote a comment I left on the pint implementation:

  (* Uniform positive random int generator.

     We can't use {!Random.State.int} because the upper bound must be positive and is excluded,
     so {!Int.max_int} would never be reached. We have to manipulate bits directly.

     Note that the leftmost bit is used for negative numbers, so it must be [0].

     {!Random.State.bits} only generates 30 bits, which is exactly enough on
     32-bits architectures (i.e. {!Sys.int_size} = 31, i.e. 30 bits for positive numbers)
     but not on 64-bits ones.

     That's why for 64-bits, 3 30-bits segments are generated and shifted to craft a
     62-bits number (i.e. {!Sys.int_size} = 63). The leftmost segment is masked to keep
     only the last 2 bits.

     The current implementation hard-codes 30/32/62/64 values, but technically we should
     rely on {!Sys.int_size} to find the number of bits.

     Note that we could also further generalize this function to merge it with [random_binary_string].
     Technically this function is a special case of [random_binary_string] where the size is
     {!Sys.int_size}.
  *)

TL;DR: I think a generic (internal) function that takes a bit size and returns an int64 (which can always be cast down to int / int32) could be reused for a variety of generators (signed + positive, negative by toying with the leftmost bit)

c-cube commented 3 years ago

Using the implementation I propose in https://github.com/c-cube/qcheck/pull/109 , namely:

let int32 st =
  let a = RS.int32 (1 lsl 16) in
  let b = RS.int32 (1 lsl 16) in
  (a lsl 16) lor b

it seems like the unsigned version, i.e the version where the MSB is 0, would be pretty similar:

let uint32 st =
  let a = RS.int32 (1 lsl 15) in
  let b = RS.int32 (1 lsl 16) in
  (a lsl 16) lor b

Is that correct?

c-cube / qcheck

Gen.(ui32,ui64) unsigned? #90