yanboliang / spark-vlbfgs

Vector-free L-BFGS implementation for Spark MLlib
Apache License 2.0

[spark-vflbfgs] add blockMatrixHorzZipVec and blockMatrixVertZipVec helper methods (prepared for Vector-free LogisticRegression implementation) #2

Closed WeichenXu123 closed 7 years ago

WeichenXu123 commented 7 years ago

What changes were proposed in this pull request?

1. Add two VFUtils helper methods, blockMatrixHorzZipVec and blockMatrixVertZipVec.

  def blockMatrixHorzZipVec[T: ClassTag](
      blockMatrixRDD: RDD[((Int, Int), SparseMatrix)],
      dvec: DistributedVector,
      gridPartitioner: GridPartitionerV2,
      f: (SparseMatrix, Vector) => T
  )

This method performs an operation similar to multiplying a matrix by a column vector. The column dimension of the input blockMatrixRDD must match the input dvec, and the gridPartitioner parameter determines how blockMatrixRDD is partitioned. The function f defines the operation applied to each joined matrix block and its corresponding dvec partition.
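To make the pairing concrete, here is a minimal local sketch of the idea in plain Scala collections, without Spark. It is not the code from this PR: `HorzZipSketch`, `horzZip`, and `multiply` are hypothetical names, dense arrays stand in for SparseMatrix, plain Maps stand in for the RDD and the DistributedVector, and the gridPartitioner is left out. It assumes that block (i, j) is joined with vector partition j, which is one reading of the description above.

```scala
// Local, in-memory sketch (no Spark). All names here are illustrative assumptions.
object HorzZipSketch {
  type Block = Array[Array[Double]] // dense stand-in for SparseMatrix
  type VecPart = Array[Double]      // stand-in for one dvec partition

  // Pair block (i, j) with vector partition j (column-block index) and apply f.
  def horzZip[T](
      blocks: Map[(Int, Int), Block],
      vecParts: Map[Int, VecPart],
      f: (Block, VecPart) => T): Map[(Int, Int), T] =
    blocks.map { case ((i, j), block) => ((i, j), f(block, vecParts(j))) }

  // Example f: local block-times-column-vector product.
  def multiply(block: Block, v: VecPart): Array[Double] =
    block.map(row => row.zip(v).map { case (a, b) => a * b }.sum)

  def main(args: Array[String]): Unit = {
    // The 2x4 matrix [[1,2,3,4],[5,6,7,8]] split into two 2x2 column blocks.
    val blocks = Map(
      (0, 0) -> Array(Array(1.0, 2.0), Array(5.0, 6.0)),
      (0, 1) -> Array(Array(3.0, 4.0), Array(7.0, 8.0)))
    // The vector [1,1,1,1] split into two partitions of length 2.
    val vecParts = Map(0 -> Array(1.0, 1.0), 1 -> Array(1.0, 1.0))

    val partial = horzZip(blocks, vecParts, multiply)
    // Summing the partial results over the column-block index gives the full product.
    val product = partial.values.reduce((a, b) => a.zip(b).map { case (x, y) => x + y })
    println(product.mkString(", ")) // prints "10.0, 26.0"
  }
}
```

Because f is a parameter, the same pairing can carry operations other than multiplication; the zip itself only decides which vector partition each block sees.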

  def blockMatrixVertZipVec[T: ClassTag](
      blockMatrixRDD: RDD[((Int, Int), SparseMatrix)],
      dvec: DistributedVector,
      gridPartitioner: GridPartitionerV2,
      f: (SparseMatrix, Vector) => T
  )

This method is similar to blockMatrixHorzZipVec, but it corresponds to multiplying a row vector by a matrix.
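A minimal local sketch of the vertical variant, again in plain Scala collections rather than Spark, may help. It is not the code from this PR: `VertZipSketch`, `vertZip`, and `leftMultiply` are hypothetical names, dense arrays stand in for SparseMatrix, and plain Maps stand in for the RDD and the DistributedVector. It assumes that block (i, j) is joined with vector partition i, the row-block index.

```scala
// Local, in-memory sketch (no Spark). All names here are illustrative assumptions.
object VertZipSketch {
  type Block = Array[Array[Double]] // dense stand-in for SparseMatrix
  type VecPart = Array[Double]      // stand-in for one dvec partition

  // Pair block (i, j) with vector partition i (row-block index) and apply f.
  def vertZip[T](
      blocks: Map[(Int, Int), Block],
      vecParts: Map[Int, VecPart],
      f: (Block, VecPart) => T): Map[(Int, Int), T] =
    blocks.map { case ((i, j), block) => ((i, j), f(block, vecParts(i))) }

  // Example f: row vector times block, result(c) = sum over r of v(r) * block(r)(c).
  def leftMultiply(block: Block, v: VecPart): Array[Double] =
    block.head.indices.map(c => block.indices.map(r => v(r) * block(r)(c)).sum).toArray

  def main(args: Array[String]): Unit = {
    // The 4x2 matrix [[1,2],[3,4],[5,6],[7,8]] split into two 2x2 row blocks.
    val blocks = Map(
      (0, 0) -> Array(Array(1.0, 2.0), Array(3.0, 4.0)),
      (1, 0) -> Array(Array(5.0, 6.0), Array(7.0, 8.0)))
    // The row vector [1,1,1,1] split into two partitions of length 2.
    val vecParts = Map(0 -> Array(1.0, 1.0), 1 -> Array(1.0, 1.0))

    val partial = vertZip(blocks, vecParts, leftMultiply)
    // Summing the partial results over the row-block index gives the full product.
    val product = partial.values.reduce((a, b) => a.zip(b).map { case (x, y) => x + y })
    println(product.mkString(", ")) // prints "16.0, 20.0"
  }
}
```

The only difference from the horizontal sketch is which block index selects the vector partition; the reduction afterwards runs over the other index.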

2. Remove the CustomCoalescer class (useless code).

How was this patch tested?

Added test cases for blockMatrixHorzZipVec and blockMatrixVertZipVec, along with test cases for other VFUtils helper functions.