uncomplicate / neanderthal

Fast Clojure Matrix Library
http://neanderthal.uncomplicate.org
Eclipse Public License 1.0

Split `random.cl` into two files, one per precision #114

Closed. futuro closed this 3 years ago

futuro commented 3 years ago

Not all GPUs support double precision, so this PR splits the `random.cl` file in two, one per precision, so that our OpenCL programs still compile on GPUs that only support single precision.
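To illustrate the idea, here is a minimal sketch of how a caller might decide which sources to compile for a given device. It assumes the device's extension string is already available, and that double-precision support is advertised through the standard `cl_khr_fp64` extension name. The file names `random-single.cl` and `random-double.cl` and the functions `supports-double?` and `random-sources` are hypothetical, made up for this example; they are not taken from this PR or from Neanderthal's API.

```clojure
(ns example.precision
  (:require [clojure.string :as str]))

;; Hypothetical file names, used only for illustration.
(def single-precision-source "random-single.cl")
(def double-precision-source "random-double.cl")

(defn supports-double?
  "True when a space-separated OpenCL extension string lists cl_khr_fp64."
  [extensions]
  (contains? (set (str/split (str/trim extensions) #"\s+"))
             "cl_khr_fp64"))

(defn random-sources
  "Returns the source files to compile for a device with the given
  extension string: always single precision, and double precision
  only when the device advertises support for it."
  [extensions]
  (cond-> [single-precision-source]
    (supports-double? extensions) (conj double-precision-source)))

(random-sources "cl_khr_byte_addressable_store cl_khr_fp64")
;; => ["random-single.cl" "random-double.cl"]

(random-sources "cl_khr_byte_addressable_store")
;; => ["random-single.cl"]
```

Keeping each precision in its own file means a single-precision-only device never has to compile any `double` code.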

This PR started as issue #31 in clojurecl. Let me know if you'd like me to open a companion issue in this repo. I'm going to close the clojurecl issue now.

I've tested this successfully with the following snippet. This PR does not contain any changes to test files: my GPU doesn't support double precision, so I can't run the tests.

(ns example
  (:require [uncomplicate.commons.core :as ccore]
            [uncomplicate.clojurecl.core :as opencl]
            [uncomplicate.neanderthal
             [core :as ncore]
             [opencl :as cl]
             [random :as nrandom]]
            [criterium.core :as crit]))

(opencl/with-default
  (cl/with-default-engine
    (ccore/with-release [gpu-a (nrandom/rand-uniform! (cl/clge 256 256))
                         gpu-b (nrandom/rand-uniform! (cl/clge 256 256))
                         gpu-c (ncore/zero gpu-a)]
      (crit/with-progress-reporting
        (crit/quick-bench
         (do (ncore/mm! (float 1.0) gpu-a gpu-b (float 0.0) gpu-c)
             (opencl/finish!)))))))

blueberry commented 3 years ago

This seems good on visual check, and also passes automatic tests, so I think it's good to go. Thank you!

blueberry commented 3 years ago

Version 0.42.0 is in Clojars. Please confirm that it works for you out of the box.

futuro commented 3 years ago

@blueberry Yep, works perfectly!