microsoft / microsoft-r-open

Don't know what R does when calling crossprod(a,b) with dim(a)=c(3,1) and dim(b)=c(3,large) #84

Closed. Neutron3529 closed this issue 5 years ago

Neutron3529 commented 5 years ago

Sorry, I don't know how to switch the default Chinese output to English. In the timings below, the column headers 用户, 系统 and 流逝 mean user, system and elapsed. Running system.time(for(i in 1:10000)crossprod(matrix(1,3,1),matrix(1,3,10000))) in CRAN R 3.5.1 gives:

> system.time(for(i in 1:10000)crossprod(matrix(1,3,1),matrix(1,3,10000)))
用户 系统 流逝 
0.72 0.00 0.72 
> system.time(for(i in 1:10000)crossprod(matrix(1,3,1),matrix(1,3,10000)))
用户 系统 流逝 
0.72 0.00 0.72 
> system.time(for(i in 1:10000)crossprod(matrix(1,3,1),matrix(1,3,10000)))
用户 系统 流逝 
0.74 0.00 0.73 
> system.time(for(i in 1:10000)crossprod(matrix(1,3,1),matrix(1,3,10000)))
用户 系统 流逝 
0.72 0.00 0.72 
> system.time(for(i in 1:10000)crossprod(matrix(1,3,1),matrix(1,3,10000)))
用户 系统 流逝 
0.74 0.00 0.74 

Elapsed time averages about 0.73 s, and user time averages about 0.73 s too. With R Open, however, the numbers look quite different:

> system.time(for(i in 1:10000)crossprod(matrix(1,3,1),matrix(1,3,10000)))
用户 系统 流逝 
3.70 0.33 0.67 
> system.time(for(i in 1:10000)crossprod(matrix(1,3,1),matrix(1,3,10000)))
用户 系统 流逝 
3.78 0.34 0.69 
> system.time(for(i in 1:10000)crossprod(matrix(1,3,1),matrix(1,3,10000)))
用户 系统 流逝 
3.60 0.25 0.64 
> system.time(for(i in 1:10000)crossprod(matrix(1,3,1),matrix(1,3,10000)))
用户 系统 流逝 
3.49 0.36 0.64 
> system.time(for(i in 1:10000)crossprod(matrix(1,3,1),matrix(1,3,10000)))
用户 系统 流逝 
3.67 0.45 0.69 

This shows that R Open is slightly faster in elapsed time (0.67 s on average), but its CPU time (about 3.65 s of user time on average) is much larger than expected.

I just wonder what is happening.
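
Note that the benchmark above constructs both matrices inside the timed loop, so each iteration also pays for two calls to matrix(). A minimal sketch of how the two costs could be separated is shown here; the variable names and the reduced loop count of 1000 are illustrative choices, and this does not by itself explain the CPU-time gap.

```r
# Build the operands once, outside the timed region.
a <- matrix(1, 3, 1)
b <- matrix(1, 3, 10000)

# Timing that includes matrix construction on every iteration.
t_inside <- system.time(
  for (i in 1:1000) crossprod(matrix(1, 3, 1), matrix(1, 3, 10000))
)

# Timing of crossprod() alone on pre-built operands.
t_hoisted <- system.time(
  for (i in 1:1000) crossprod(a, b)
)

# crossprod(a, b) computes t(a) %*% b, here a 1 x 10000 row of 3s.
res <- crossprod(a, b)

print(rbind(inside = t_inside, hoisted = t_hoisted)[, c("user.self", "sys.self", "elapsed")])
```

Comparing the two rows on both CRAN R and R Open would show how much of each timing belongs to crossprod() itself.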

jeroenterheerdt commented 5 years ago

Quoting @richcalaway:

These numbers indicate that MRO is working correctly and is actually employing the MKL library to parallelize the cross-product computation. The CPU time shows what all the threads are doing combined, so it is larger than the time for any single thread (such as the unparallelized CRAN R computation).
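
One way to read such timings: dividing total CPU time (user + system) by elapsed time gives a rough estimate of how many cores were busy on average. The sketch below applies that to the numbers posted in this thread; busy_cores is a hypothetical helper name, not part of R or MRO.

```r
# Timings copied from the first comment: user, system, elapsed (seconds).
cran <- data.frame(
  user    = c(0.72, 0.72, 0.74, 0.72, 0.74),
  system  = c(0.00, 0.00, 0.00, 0.00, 0.00),
  elapsed = c(0.72, 0.72, 0.73, 0.72, 0.74)
)
mro <- data.frame(
  user    = c(3.70, 3.78, 3.60, 3.49, 3.67),
  system  = c(0.33, 0.34, 0.25, 0.36, 0.45),
  elapsed = c(0.67, 0.69, 0.64, 0.64, 0.69)
)

# Hypothetical helper: average number of busy cores implied by the timings.
busy_cores <- function(t) sum(t$user + t$system) / sum(t$elapsed)

busy_cores(cran)  # about 1: a single thread doing all the work
busy_cores(mro)   # about 6: several threads active at once
```

A ratio near 6 with an almost unchanged elapsed time suggests the extra threads add little at this problem size, although the timings alone cannot say where that CPU time goes.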

Neutron3529 commented 5 years ago

Shouldn't the two CPU times be the same? Why is the parallelized version so slow, and where is the CPU time actually being spent? I once believed that it was the parallelization that made the CPU time so large, but when I increased the size of the matrix, nothing changed: the elapsed times of the calculation are still almost the same:

> a=matrix(1,3,1);b=matrix(1,3,10000000);system.time(for(i in 1:100)crossprod(a,b))
用户 系统 流逝 
5.48 1.25 6.73
> a=matrix(1,3,1);b=matrix(1,3,10000000);system.time(for(i in 1:100)crossprod(a,b))
 用户  系统  流逝 
28.55  5.51  5.70 

I strongly believe that even a crossprod function I wrote myself in C with #pragma omp parallel for would benefit a lot, taking less time (both user and elapsed) to get the expected result. I just want to know why R Open is so slow when dealing with matrices of this shape.
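
To check whether the extra threads account for the additional CPU time, one could compare runs at different thread counts. A minimal sketch follows, assuming the RevoUtilsMath package that ships with Microsoft R Open is installed and provides getMKLthreads() and setMKLthreads(); on plain CRAN R the guard simply skips the thread setting. time_crossprod is a hypothetical helper name.

```r
# Hypothetical helper: time crossprod() on the shapes from this issue,
# optionally pinning the MKL thread count when RevoUtilsMath is available.
time_crossprod <- function(threads = NULL, reps = 100) {
  a <- matrix(1, 3, 1)
  b <- matrix(1, 3, 10000)

  # Assumption: RevoUtilsMath is only present on Microsoft R Open.
  has_mkl <- requireNamespace("RevoUtilsMath", quietly = TRUE)
  if (has_mkl && !is.null(threads)) {
    old <- RevoUtilsMath::getMKLthreads()
    on.exit(RevoUtilsMath::setMKLthreads(old), add = TRUE)  # restore afterwards
    RevoUtilsMath::setMKLthreads(threads)
  }

  system.time(for (i in seq_len(reps)) crossprod(a, b))
}

t_single  <- time_crossprod(threads = 1)  # one MKL thread, if MKL is present
t_default <- time_crossprod()             # whatever the session default is
print(rbind(single = t_single, default = t_default)[, c("user.self", "sys.self", "elapsed")])
```

If the single-thread run on R Open shows user time close to elapsed time, that would point at threading overhead rather than at crossprod itself.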