koheiw / proxyC

R package for large-scale similarity/distance computation
GNU General Public License v3.0

Add inner-product? #54

Open koheiw opened 4 months ago

koheiw commented 4 months ago

I wonder if we can help people compute inner products by adding a "product" method. For example:

https://stackoverflow.com/questions/40228592/fastest-way-to-compute-row-wise-dot-products-between-two-skinny-tall-matrices-in

On the add-product branch:

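# Two random sparse matrices: 1,000 rows x 10,000 columns, 50% non-zero
r <- 10^3
c <- 10^4
A <- Matrix::rsparsematrix(r, c, 0.5)
B <- Matrix::rsparsematrix(r, c, 0.5)

# Pairwise inner products between the rows of A and B, computed two ways
out1 <- proxyC:::proxy(A, B, method = "product", sparse = FALSE, min_proxy = -Inf)
out2 <- Matrix::tcrossprod(A, B)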

identical(as.matrix(out1), as.matrix(out2))
#> [1] FALSE
all(abs(out1 - out2) < 1e-10)
#> [1] TRUE
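
The two results above agree within a tolerance but are not bit-for-bit identical. A likely reason (an assumption here, not something verified against either implementation) is that the two code paths add the same terms in a different order, and floating-point addition is not associative. A minimal base-R sketch of that effect, unrelated to proxyC's internals:

```r
# Floating-point addition is not associative, so summing the same
# terms in a different order can change the last bits of the result.
x <- c(0.1, 0.2, 0.3)
s1 <- (x[1] + x[2]) + x[3]
s2 <- x[1] + (x[2] + x[3])

exact_match <- identical(s1, s2)          # bit-for-bit comparison
close_match <- isTRUE(all.equal(s1, s2))  # comparison within a tolerance
max_diff <- abs(s1 - s2)                  # size of the rounding difference

print(c(exact = exact_match, close = close_match))
print(max_diff)
```

For checks like the one above, `all.equal()` or an explicit tolerance is therefore the more meaningful test than `identical()`.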

microbenchmark::microbenchmark(
    proxyC = proxyC:::proxy(A, B, method = "product", sparse = FALSE, min_proxy = -Inf),
    Matrix = Matrix::tcrossprod(A, B),
    times = 10
)
#> Unit: seconds
#>    expr      min       lq     mean   median       uq      max neval
#>  proxyC 2.444084 2.585879 2.749906 2.699738 2.862626 3.111521    10
#>  Matrix 3.938076 4.255705 4.371230 4.417503 4.530834 4.891848    10

We could also add proxyC::crossprod() if that is more intuitive.
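
For readers weighing the naming, here is a small base-R sketch of how the quantities relate. It uses only `tcrossprod()`, `crossprod()` and `rowSums()` from base R on tiny dense matrices; it is an illustration of the definitions, not the proposed proxyC API, and the variable names are made up for this example.

```r
# Two small dense matrices with the same shape (2 rows x 3 columns)
A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)
B <- matrix(c(6, 5, 4, 3, 2, 1), nrow = 2)

# tcrossprod(A, B) is A %*% t(B): inner products between every pair of rows
pairwise_rows <- tcrossprod(A, B)

# crossprod(A, B) is t(A) %*% B: inner products between every pair of columns
pairwise_cols <- crossprod(A, B)

# The row-wise dot products asked about in the linked question
# (row i of A with row i of B) are the diagonal of the pairwise result
rowwise <- rowSums(A * B)

print(dim(pairwise_rows))
print(dim(pairwise_cols))
print(isTRUE(all.equal(diag(pairwise_rows), rowwise)))
```

Note that when only the matching-row products are needed, `rowSums(A * B)` avoids building the full pairwise matrix.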