I have some results from tests of both precisions. Two error measurements were carried out. Both compare every single element of the result of the GPU implementation [gpu] with the corresponding element of the original Python implementation [cpu].
Notice: the float32 errors are comparisons between CUDA single precision and Python double precision, since Python functions such as scipy.spatial.distance do not yield float32 output, regardless of the input data type.
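The dtype behaviour mentioned in the notice can be checked directly. The following is only a minimal sketch: it assumes NumPy and SciPy are installed, uses `cdist` as one representative function from scipy.spatial.distance, and the array shape and seed are arbitrary choices, not taken from unit_test2.py.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical float32 input; shape and seed are arbitrary.
rng = np.random.default_rng(0)
x32 = rng.random((4, 3)).astype(np.float32)

# Pairwise Euclidean distances computed from single-precision input.
d = cdist(x32, x32)

# Print the input dtype next to the output dtype to compare them.
print(x32.dtype, d.dtype)
```

If the output dtype differs from the input dtype here, the reference values are effectively computed in a different precision than the GPU float32 results, which is worth keeping in mind when reading the float32 rows.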
absolute error = abs(cpu_i - gpu_i)

relative error = abs((cpu_i - gpu_i) / cpu_i)
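The two definitions above can be sketched in NumPy as follows. This is an illustration, not the code from unit_test2.py: the function name `error_stats` and the sample arrays are hypothetical, and it assumes the reference array contains no zeros.

```python
import numpy as np

def error_stats(cpu, gpu):
    """Return (average, max) of the element-wise absolute and relative errors."""
    cpu = np.asarray(cpu, dtype=np.float64)
    gpu = np.asarray(gpu, dtype=np.float64)
    abs_err = np.abs(cpu - gpu)
    # Assumption: no element of cpu is zero, otherwise this divides by zero.
    rel_err = np.abs((cpu - gpu) / cpu)
    return {
        "error_absolute": (abs_err.mean(), abs_err.max()),
        "error_relative": (rel_err.mean(), rel_err.max()),
    }

# Made-up values, only to exercise the function.
cpu = np.array([1.0, 2.0, 4.0])
gpu = np.array([1.0, 2.5, 3.0])
stats = error_stats(cpu, gpu)
print(stats)
```

Because the relative error divides by the reference value, it grows very large when the reference values are close to zero, even if the absolute error is moderate.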
_The following results were produced by unit_test2.py_
| quantity | precision | absolute_val | error_absolute [average] | error_absolute [max] | error_relative [average] | error_relative [max] |
|---|---|---|---|---|---|---|
| mu | float64 | 0.21 | 8.1e-13 | 3.5e-12 | 8.3e-12 | 3.8e-10 |
| mu | float32 | 0.21 | 0.00032 | 0.0014 | 0.0031 | 0.15 |
| var | float64 | 5.5e-07 | 1e-10 | 6.2e-10 | 0.00055 | 0.006 |
| var | float32 | 5.5e-07 | 0.1 | 0.55 | 5.4e+05 | 5.9e+06 |
| deriv | float64 | 0.088 | 1.3e-13 | 3.8e-12 | 5.6e-12 | 4.6e-09 |
| deriv | float32 | 0.088 | 4.1e-05 | 0.00094 | 0.0014 | 1 |