Prime factorization? - Githubissues

primo-ppcg commented 1 year ago

With the merger of #145, I think we have just about everything needed. I've prepared a branch which adds math/factor for the purpose of evaluation.

Using this test script, I get timings like this:

pseudoprimes   1.003s, 47.74µs/body
small integers 0.469s, 4.688µs/body

Which is definitely fast enough to be of use.

One issue that I currently have is that math/gcd does not work with abstract types, although it does appear to be written by hand (and not a direct math.h function), so it could possibly be updated. I also think that the pollard-rho implementation could be cleaned up a bit.

sogaiu commented 1 year ago

Below are the times I got when using janet 1.30.0:

$ JANET_PATH=~/src/spork.primo-ppcg/jpm_tree/lib janet ~/scratch/test-factor.janet
pseudoprimes   1.002s, 82.27µs/body
small integers 0.603s, 6.029µs/body

FWIW, results were similar with janet's 7049f658.

/me is looking forward to upgrading hardware...

primo-ppcg commented 1 year ago

After https://github.com/janet-lang/janet/pull/1249 this became about 15% faster:

pseudoprimes   1.004s, 41.61µs/body
small integers 0.331s, 3.308µs/body

and the strange thing is, I don't even use any compare functions... not directly anyway.

sogaiu commented 1 year ago

With that janet (7475362c), I got:

$ JANET_PATH=~/src/spork.primo-ppcg/jpm_tree/lib janet ~/scratch/test-factor.janet
pseudoprimes   1.004s, 75.69µs/body
small integers 0.432s, 4.316µs/body

So there appears to be some speedup here too.

primo-ppcg commented 1 year ago

I've updated to handle int/s64 and int/u64:

> (factor 123456789123)
@[3 12049 3415409]
> (factor (int/s64 123456789123))
@[<core/s64 3> <core/s64 12049> <core/s64 3415409>]

It has become a bit slower, but that was expected. Not too bad, though:

pseudoprimes   1.001s, 42.47µs/body
small integers 0.428s, 4.278µs/body

The merge sort became a bit messy, because compare< doesn't handle nil elegantly.

(defn- merge-sorted
  "merge two sorted arrays"
  [a b]
  (def len (+ (length a) (length b)))
  (def c (array/new-filled len))
  (var [i j] [0 0])
  (forv k 0 len
    (let [u (get a i)
          v (get b j)]
      (if (not= nil u)
        (if (not= nil v)
          (if (compare< u v)
            (do (put c k u) (++ i))
            (do (put c k v) (++ j)))
          (do (put c k u) (++ i)))
        (do (put c k v) (++ j)))))
  c)

In particular, the repetition is a somewhat ugly. If someone could help clean that up, that'd be really nice :laughing:

I think that merge-sorted and merge-sorted-by could be nice complements to insert-sorted and insert-sorted-by in spork/misc.

sogaiu commented 1 year ago

Didn't have luck reducing duplication, but possibly the following is a bit easier to read?

(defn- merge-sorted-2
  "merge two sorted arrays"
  [a b]
  (def len (+ (length a) (length b)))
  (def c (array/new-filled len))
  (var [i j] [0 0])
  (forv k 0 len
    (def [u v] [(get a i) (get b j)])
    (cond
      (nil? u) (do (put c k v) (++ j))
      (nil? v) (do (put c k u) (++ i))
      (compare< u v) (do (put c k u) (++ i))
      (do (put c k v) (++ j))))
  c)

or may be without the repeated destructuring in the loop could be better?

primo-ppcg commented 1 year ago

Definitely nicer to look at :+1:

or may be without the repeated destructuring in the loop could be better?

Destructuring is already compiler optimized :smiley:

sogaiu commented 1 year ago

I had another version which is longer and not as nice to view, but may be it does less nil-related comparisons sometimes...

(defn- merge-sorted-3
  "merge two sorted arrays"
  [a b]
  (def len (+ (length a) (length b)))
  (def c (array/new-filled len))
  (var [i j k] [0 0 0])
  (var [u v] [(get a i) (get b j)])
  (while (and (not= nil u) (not= nil v))
    (if (compare< u v)
      (do 
        (put c k u) 
        (++ i)
        (set u (get a i)))
      (do 
        (put c k v) 
        (++ j)
        (set v (get b j))))
    (++ k))
  (cond
    (nil? u)
    (while (< k len)
      (put c k v) 
      (++ j)
      (set v (get b j))
      (++ k))
    #
    (nil? v)
    (while (< k len)
      (put c k u) 
      (++ i)
      (set u (get a i))
      (++ k)))
  c)

Possibly not worth it.

pepe commented 1 year ago

I think that merge-sorted and merge-sorted-by could be nice complements to insert-sorted and insert-sorted-by in spork/misc.

I am sure it would be excellent :-).

primo-ppcg commented 1 year ago

Possibly not worth it.

Switching nil? for = nil is a more significant performance gain :wink:

sogaiu commented 1 year ago

If there were an array/blit (cf. buffer/blit), may be one could have done something like:

(defn- merge-sorted-4
  "merge two sorted arrays"
  [a b]
  (def len (+ (length a) (length b)))
  (def c (array/new-filled len))
  (var [i j k] [0 0 0])
  (var [u v] [(get a i) (get b j)])
  (while (and (not= nil u) (not= nil v))
    (if (compare< u v)
      (do 
        (put c k u) 
        (++ i)
        (set u (get a i)))
      (do 
        (put c k v) 
        (++ j)
        (set v (get b j))))
    (++ k))
  (when (< k len)
    (cond
      (not= nil v) (array/blit c a k i)
      (not= nil u) (array/blit c b k j)))
  c)

sogaiu commented 1 year ago

Switching nil? for = nil is a more significant performance gain

Often torn by these sorts of things because I find nil? (and zero?, one?, etc.) easier to read and less likely to mistype.

Still, I admit to not measuring (^^;

primo-ppcg commented 1 year ago

Often torn by these sorts of things because I find nil? (and zero?, one?, etc.) easier to read and less likely to mistype.

I think there's a reasonable expectation that core functions (and even "official contrib" functions) are a written approximately as efficiently as possible, i.e. a user likely wouldn't be able to write anything more efficient by hand. If I find that isn't the case, I make a PR about it :smiley:

pepe commented 1 year ago

Great work @primo-ppcg !

sogaiu commented 1 year ago

If there were an array/blit (cf. buffer/blit)

Was looking at @MikeBeller's janet-benchmarksgame for reasons (TM), and noticed this text:

An equivalent of buffer/blit for arrays (e.g. array/blit) would help as a lot of benchmarks involve moving things between arrays, and if you use Janet's indexed combinators, they are always allocating new arrays. (See fannkuch) array/reverse-in would also be great.

via https://github.com/MikeBeller/janet-benchmarksgame#ideas-for-improving-janet-performance

janet-lang / spork

Prime factorization? #147