scalanlp / breeze

Breeze is/was a numerical processing library for Scala.
https://www.scalanlp.org
Apache License 2.0
3.45k stars 693 forks source link

Auto boxing inside foreach DenseVector[Double] when call norm function #771

Closed kovrizhin closed 4 years ago

kovrizhin commented 4 years ago

When we call norm function for DenseVector we always have boxing/unboxing stuff.

I am not pro in scala but looks like it is because of foreach use generic approach: override def foreach[@spec(Unit) U](fn: (V) => U)

Also the second problem would be: val nn = f.sNorm(v); sum += nn * nn looks like here it will be also convert to java.lang.Double

May be it is possible to simple rewrite it as normal for-loop and don't use implicit resolving sNorm function.

I spent a bit of time but not sure how to do it correctly and in generic way, and this is why I decide to write here may be you can help.

Thanks in advance

dlwh commented 4 years ago

it does seem to be boxing. That's infuriating. I'll use a manual loop

dlwh commented 4 years ago

ok fixed in master

kovrizhin commented 4 years ago

Thanks a lot, for a quick fix. I have a look into changes, looks cool.

But I have 2 question:

1) DenseVectorOps also has some method, which mostly calls and make a bigger problem:

  implicit def canNorm[T: Field]: norm.Impl[DenseVector[T], Double] = {
    val f = implicitly[Field[T]]
    new norm.Impl[DenseVector[T], Double] {
      override def apply(v: DenseVector[T]): Double = {
        import v._
        var sum = 0.0
        foreach(v => { val nn = f.sNorm(v); sum += nn * nn })
        math.sqrt(sum)
      }
    }
  }

This function is called from LBFGS:

  protected def determineStepSize(state: State, f: DiffFunction[T], dir: T) = {
    val x = state.x
    val grad = state.grad

    val ff = LineSearch.functionFromSearchDirection(f, x, dir)
    val search = new StrongWolfeLineSearch(maxZoomIter = 10, maxLineSearchIter = 10) // TODO: Need good default values here.
    val alpha = search.minimize(ff, if (state.iter == 0.0) 1.0 / norm(dir) else 1.0)

    if (alpha * norm(grad) < 1E-10)
      throw new StepSizeUnderflow
    alpha
  }
norm(dir)
norm(grad)

2) Changes from v.foreach -> to simple cforRange(0 until v.length) can be a bug, because foreach used stride, offset, what do you think?

 override def foreach[@spec(Unit) U](fn: (V) => U): Unit = {
    if (stride == 1) { // ABCE stuff
      cforRange(offset until (offset + length)) { j =>
        fn(data(j))
      }
    } else {
      var i = offset
      cforRange(0 until length) { j =>
        fn(data(i))
        i += stride
      }
    }
  }
dlwh commented 4 years ago

(1)'s a good point and I'll have to fix.

(2) it's only an issue if you go through the data array. If you use apply on DenseVector (which I do in norm) it's handled there. It's a bit slower to not hit data directly, but it's more of a pain to write.