scalanlp / breeze

Breeze is a numerical processing library for Scala.
www.scalanlp.org
Apache License 2.0
3.44k stars 691 forks source link

Is there a reason why Single Precision versions of functions like leastSquares not exposed? #855

Open dineshdharme opened 9 months ago

dineshdharme commented 9 months ago

As the title says, it would be good for single precision versions which make use of native libraries like mkl, openblas or nvblas be exposed to the enduser. As these functions are already present in netlib (https://github.com/luhenry/netlib), it shouldn't be too much of work.

dlwh commented 9 months ago

Honestly, there's no reason other than I started out prioritizing Double and never bothered. Happy to take a PR...

dineshdharme commented 9 months ago

For my own purposes, I just created a new file and it works. But I am guessing PR would be mean a lot more than that.

Here's an example.

https://pastebin.com/VpFiPbba

package breeze.stats.regression

import breeze.generic.UFunc
import breeze.linalg._
import org.netlib.util.intW
import dev.ludovic.netlib.lapack.LAPACK.{getInstance => lapack}
import java.util.Arrays

private object leastSquaresSingleImplementation {
  def doLeastSquaresSingle(
      data: DenseMatrix[Float],
      outputs: DenseVector[Float],
      workArray: Array[Float]): LeastSquaresRegressionSingleResult = {
    require(data.rows == outputs.size)
    require(data.rows > data.cols + 1)
    require(workArray.length >= 2 * data.rows * data.cols)

    val info = new intW(0)
    lapack.sgels(
      "N",
      data.rows,
      data.cols,
      1,
      data.data,
      data.rows,
      outputs.data,
      data.rows,
      workArray,
      workArray.length,
      info)
    if (info.`val` < 0) {
      throw new ArithmeticException("Least squares did not converge.")
    }

    val coefficients = new DenseVector[Float](Arrays.copyOf(outputs.data, data.cols))
    var r2 = 0.0
    for (i <- 0 until (data.rows - data.cols)) {
      r2 = r2 + math.pow(outputs.data(data.cols + i), 2)
    }
    LeastSquaresRegressionSingleResult(coefficients, r2.toFloat)
  }
}

case class LeastSquaresRegressionSingleResult(coefficients: DenseVector[Float], rSquared: Float)
    extends RegressionResult[DenseVector[Float], Float] {
  def apply(x: DenseVector[Float]): Float = coefficients.dot(x)

  def apply(X: DenseMatrix[Float]): DenseVector[Float] = X * coefficients
}

object leastSquaresSingle extends UFunc {
  implicit val matrixVectorWithWorkArray
    : Impl3[DenseMatrix[Float], DenseVector[Float], Array[Float], LeastSquaresRegressionSingleResult] =
    new Impl3[DenseMatrix[Float], DenseVector[Float], Array[Float], LeastSquaresRegressionSingleResult] {
      def apply(
          data: DenseMatrix[Float],
          outputs: DenseVector[Float],
          workArray: Array[Float]): LeastSquaresRegressionSingleResult =
        leastSquaresSingleImplementation.doLeastSquaresSingle(data.copy, outputs.copy, workArray)
    }

  implicit val matrixVectorSpecifiedWork
    : Impl3[DenseMatrix[Float], DenseVector[Float], Int, LeastSquaresRegressionSingleResult] =
    new Impl3[DenseMatrix[Float], DenseVector[Float], Int, LeastSquaresRegressionSingleResult] {
      def apply(data: DenseMatrix[Float], outputs: DenseVector[Float], workSize: Int): LeastSquaresRegressionSingleResult =
        leastSquaresSingleImplementation.doLeastSquaresSingle(data.copy, outputs.copy, new Array[Float](workSize))
    }

  implicit val matrixVector: Impl2[DenseMatrix[Float], DenseVector[Float], LeastSquaresRegressionSingleResult] =
    new Impl2[DenseMatrix[Float], DenseVector[Float], LeastSquaresRegressionSingleResult] {
      def apply(data: DenseMatrix[Float], outputs: DenseVector[Float]): LeastSquaresRegressionSingleResult =
        leastSquaresSingleImplementation.doLeastSquaresSingle(
          data.copy,
          outputs.copy,
          new Array[Float](math.max(1, data.rows * data.cols * 2)))
    }
}

object leastSquaresSingleDestructive extends UFunc {
  implicit val matrixVectorWithWorkArray
    : Impl3[DenseMatrix[Float], DenseVector[Float], Array[Float], LeastSquaresRegressionSingleResult] =
    new Impl3[DenseMatrix[Float], DenseVector[Float], Array[Float], LeastSquaresRegressionSingleResult] {
      def apply(
          data: DenseMatrix[Float],
          outputs: DenseVector[Float],
          workArray: Array[Float]): LeastSquaresRegressionSingleResult =
        leastSquaresSingleImplementation.doLeastSquaresSingle(data, outputs, workArray)
    }

  implicit val matrixVectorSpecifiedWork
    : Impl3[DenseMatrix[Float], DenseVector[Float], Int, LeastSquaresRegressionSingleResult] =
    new Impl3[DenseMatrix[Float], DenseVector[Float], Int, LeastSquaresRegressionSingleResult] {
      def apply(data: DenseMatrix[Float], outputs: DenseVector[Float], workSize: Int): LeastSquaresRegressionSingleResult =
        leastSquaresSingleImplementation.doLeastSquaresSingle(data, outputs, new Array[Float](workSize))
    }

  implicit val matrixVector: Impl2[DenseMatrix[Float], DenseVector[Float], LeastSquaresRegressionSingleResult] =
    new Impl2[DenseMatrix[Float], DenseVector[Float], LeastSquaresRegressionSingleResult] {
      def apply(data: DenseMatrix[Float], outputs: DenseVector[Float]): LeastSquaresRegressionSingleResult =
        leastSquaresSingleImplementation.doLeastSquaresSingle(
          data,
          outputs,
          new Array[Float](math.max(1, data.rows * data.cols * 2)))
    }
}
dineshdharme commented 8 months ago

I would like to work on supporting single precision. But as you have mentioned in other PRs it seems you are busy refractoring the codebase to migrate to scala3. Let me know if it is a good idea to work on it. Or should it wait until then?

dlwh commented 8 months ago

Thanks @dineshdharme! scala3 is already supported since the breeze 2.0 release. I'm not doing any refactoring at all.

To be honest, I'm not really going to do any work on breeze myself anymore, but I'm happy to review and merge a PR, answer questions, and even cut a release.

dineshdharme commented 8 months ago

Hello David, I created a PR for adding Float datatype support for leastSquares calculation.

https://github.com/scalanlp/breeze/pull/857

I tried my best. There could be inaccuracies in properly using Scala semantics for types and other issues. Please let me know. I will try fix those as well. This is my first PR for OSS project. Thank you..