darrenjw / scala-glm

Scala library for fitting linear and generalised linear statistical models
Apache License 2.0
28 stars 4 forks source link

Lm crashes with invalid parameters passed to blas.dtrsv #4

Closed philwalk closed 2 weeks ago

philwalk commented 2 weeks ago

Thanks for a really cool library!

The following code throws an IllegalArgumentException:

import breeze.linalg.*
import scalaglm.*

object LmBug {
  def main(args: Array[String]): Unit = {
    val X = new DenseMatrix[Double](2, 2, Array(1.0, 2.0, -1.0, 0.1))
    val y = X(0, ::).t // row zero as a column vector
    val model = Lm(y, X)
    printf("%s\n", model)
  }
}

In scalaglm.Lm the comment adjacent to the QR.r suggests that it's a p x p upper-triangular matrix. In this case, it's 2 x 3 matrix that looks something like this:

[[ -1.4,  -2.1,  0.6], 
 [0.0,  0.707,  0.77]]

Then Lm calls scalaglm.Utils.backSolve. This leads to a call to blas.dtrsv, which fails because it requires parameter 6 (A.rows == 2) to be <= parameter 4 (A.cols==3).

A p x p matrix would meet this requirement, so maybe the problem is a side-effect of calling qr.reduced, described here as returning matrices Q and R with dimensions (m, k), (k, n) with k = min(m, n)..

I'm happy to offer a PR once I understand what ought to happen here.

Here's the stackdump:

java.lang.IllegalArgumentException: ** On entry to 'DTRSV' parameter number 6 had an illegal value
        at dev.ludovic.netlib.blas.AbstractBLAS.checkArgument(AbstractBLAS.java:44)
        at dev.ludovic.netlib.blas.AbstractBLAS.dtrsv(AbstractBLAS.java:1596)
        at dev.ludovic.netlib.blas.AbstractBLAS.dtrsv(AbstractBLAS.java:1587)
        at scalaglm.Utils$.backSolve(Utils.scala:32)
        at scalaglm.Lm.(Lm.scala:76)
        at scalaglm.Lm$.apply(Lm.scala:32)
        at scalaglm.Lm$.apply(Lm.scala:266)
        at scalaglm.Lm$.apply(Lm.scala:274)
        at LmBug$.main(lmbug.sc:10)
        at LmBug.main(lmbug.sc)
        at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
        at java.base/java.lang.reflect.Method.invoke(Method.java:580)
        at dotty.tools.scripting.ScriptingDriver.compileAndRun(ScriptingDriver.scala:33)
        at dotty.tools.scripting.Main$.process(Main.scala:45)
        at dotty.tools.MainGenericRunner$.run$1(MainGenericRunner.scala:250)
        at dotty.tools.MainGenericRunner$.process(MainGenericRunner.scala:270)
        at dotty.tools.MainGenericRunner$.main(MainGenericRunner.scala:281)
        at dotty.tools.MainGenericRunner.main(MainGenericRunner.scala)
darrenjw commented 2 weeks ago

Thanks for flagging this. I agree it shouldn't crash and burn so ungracefully, but the library implicitly assumes more observations than covariates. ie. that the problem is over-constrained. This is the typical use-case for linear regression. In your example, the problem is under-constrained (because there is actually a third covariate corresponding to the intercept, so you actually have more parameters than observations).

philwalk commented 2 weeks ago

Ahh, that makes sense! I've been deep diving into the 3-pass regression filter, which has no such constraint. Should I close this, or should I offer a PR with a better error message? Perhaps this constraint should be applied to X rather than Xmat ?

darrenjw commented 2 weeks ago

Yes, exactly. I will just add the extra constraint to X and then close this when I've done it. Thanks.

philwalk commented 2 weeks ago

Ok, thanks for responding. The failure is much easier to understand with an updated constraint:

Exception in thread "main" java.lang.IllegalArgumentException: requirement failed
    at scala.Predef$.require(Predef.scala:324)
    at scalaglm.Lm.<init>(Lm.scala:43)
    at scalaglm.Lm$.apply(Lm.scala:32)
    at scalaglm.Lm$.apply(Lm.scala:266)
    at scalaglm.Lm$.apply(Lm.scala:274)
    at vast.apps.ABug$.main(ABug.scala:11)
    at vast.apps.ABug.main(ABug.scala)
darrenjw commented 2 weeks ago

Constraint added and snapshot published. Closing now. Thanks.