janet-lang / spork

Various Janet utility modules - the official "Contrib" library.
MIT License
117 stars 35 forks source link

Feature: add Gaussian sampler #169

Closed hyiltiz closed 7 months ago

hyiltiz commented 8 months ago
sogaiu commented 8 months ago

This line doesn't look right.

I don't think the code is valid:

  (math/rng-uniform ))

is problematic in at least 2 ways.

First it's missing an argument:

    (math/rng-uniform rng)

    Extract a random number in the range [0, 1) from the RNG.

and second there is an extra closing parentheses.

I'm guessing it's just stray code that would be better removed?

sogaiu commented 8 months ago

Not sure what's better, but may be forever could be used like this:

(defn rand-gaussian
  "Get a random sample from the standard Gaussian distribution.
  Optionall specify the mean m and the standard deviation sd. E.g.:
  (rand-gaussian) # => 0.1324...
  (rand-gaussian 5 0.1) # => 5.3397...
  "
  [&opt m sd]
  (default [m sd] [0 1])
  (defn scale [x] (+ m (* sd x)))

  (forever
    (def p (rand-uniform))
    (def q (rand-uniform))

    # We use the Box-Muller transform
    (let [rho (math/sqrt (* -2 (math/log q)))
          theta (* 2 math/pi p)
          _box    (* rho (math/cos theta))
          _muller (* rho (math/sin theta))
          box    (scale _box)
          muller (scale _muller)]

      (yield box)
      (yield muller))))
sogaiu commented 8 months ago

Regarding the testing comment above, the examples in the docstrings could be broken out as tests, though checking the return values might require some care.

Possibly there are better ways, but here is what I did in a PR for adding tests for randgen. The approach involves using set-seed before evaluating so that results are reproducible.

hyiltiz commented 8 months ago
hyiltiz commented 8 months ago

Currently, we have the following problems:

@sogaiu proposed an alternative implementation using a "closure"-like mechanism.

My original implementation via fibers (using yield) couldn't satisfy PC. I think this implementation satisfies them all.

sogaiu commented 8 months ago

Below is another attempt:

(def rand-gaussian
  ``
  Get a random sample from the standard Gaussian distribution. 
  Optionally specify the mean `m` and the standard deviation `sd`.
  ``
  (do
    (defn make-rg
      [&opt m sd]
      (def default-m 0)
      (default m default-m)
      (def default-sd 1)
      (default sd default-sd)
      (def state @{:m m :sd sd})
      (defn scale
        [x]
        (+ (get state :m)
           (* (get state :sd) x)))

      (defn box-muller
        []
        (let [p (math/rng-uniform (get-rng))
              q (math/rng-uniform (get-rng))
              rho (math/sqrt (* -2 (math/log q)))
              theta (* 2 math/pi p)
              box (scale (* rho (math/cos theta)))
              muller (scale (* rho (math/sin theta)))]
          [box muller]))

      (def [box muller] (box-muller))
      (-> state (put :box box) (put :muller muller))

      (fn rg [&opt m sd]
        (cond
          # "reset" if arguments do not match cached values
          (not (and (= m (get state :m default-m))
                    (= sd (get state :sd default-sd))))
          (do
            (def m (or m default-m))
            (def sd (or sd default-sd))
            (-> state (put :m m) (put :sd sd))
            (def [box muller] (box-muller))
            (-> state (put :box box) (put :muller muller))
            (rg m sd))
          #
          (def box (get state :box))
          (do
            (put state :box nil)
            box)
          #
          (def muller (get state :muller))
          (do
            (def [next-box next-muller] (box-muller))
            (-> state (put :box next-box) (put :muller next-muller))
            muller)
          #
          (errorf "unexpected state"))))
    #
    (make-rg)))

(comment

  (set-seed 1)

  (rand-gaussian)
  # =>
  0.0917440069385442

  (sample-n |(rand-gaussian 5 0.1) 3)
  # =>
  @[4.97222830484943 5.10069692026214 5.13979794451211]

  (rand-gaussian)
  # =>
  -0.856296446506397

 )
sogaiu commented 7 months ago

@hyiltiz and I are comparing the implementation above with his simpler one:

(defn rand-gaussian [&opt m sd]
  (default m 0)
  (default sd 1)
  (defn scale [x] (+ m (* sd x)))

  (def p (math/rng-uniform (get-rng)))
  (def q (math/rng-uniform (get-rng)))

  # We use the Box-Muller transform
  (let [rho (math/sqrt (* -2 (math/log q)))
        theta (* 2 math/pi p)
        _box (* rho (math/cos theta))
        box (scale _box)]
    box))

This latest one is faster and simpler:

(set-seed 1)

(timeit-loop [:timeout 1] "simpler" (rand-gaussian-simpler))

(set-seed 1)

(timeit-loop [:timeout 1] "complex" (rand-gaussian-complex))

# simpler 1.000s, 0.485µs/body
# complex 1.000s, 0.8271µs/body