ehsantn closed this issue 8 years ago
squeeze is not a supported function for Domain IR at the moment. CGen translation fails because the squeeze implementation contains strings and calls throw. squeeze is essentially a reshape operation, and CGen has trouble translating that too.
I think reshape is common enough that it should be handled by CGen and implemented as a J2C array function.
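To make the "squeeze is essentially a reshape" point concrete, here is a minimal sketch in plain Julia, without ParallelAccelerator. It assumes Julia 1.x, where squeeze has been renamed dropdims; the variable names are only for illustration.

```julia
# Assumption: Julia 1.x, where `squeeze` is called `dropdims`.
a = [1.0 2.0 3.0; 4.0 5.0 6.0]

# Indexing with a range keeps the singleton dimension: `row` is a 1×3 matrix.
row = a[1:1, :]

# Dropping the singleton dimension explicitly ...
via_dropdims = dropdims(row, dims=1)

# ... gives the same result as reshaping to a vector of the same length.
via_reshape = reshape(row, size(row, 2))

@assert size(via_reshape) == (3,)
@assert via_dropdims == via_reshape
```

Both calls only change the shape metadata and leave the element order untouched, which is why a single reshape translation in the backend would plausibly cover squeeze as well.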
The following code now works (I've verified its correctness against the program above):
    @acc begin
        function multmv(a::Array{Float64,2}, b::Array{Float64,1})
            Float64[ sum(reshape(a[i,:], size(a,2)) .* b) for i in 1:size(a,1) ]
        end
        function multvm(a::Array{Float64,1}, b::Array{Float64,2})
            Float64[ sum(a .* b[:,i]) for i in 1:size(b,2) ]
        end
        function main(iterations::Int64)
            D = 10 # Number of dimensions
            N = 100
            w = 2.0.*rand(D)-1.0
            labels = rand(N)
            points = rand(N,D)
            for i in 1:iterations
                w -= multvm((1.0./(1.0.+exp(-labels.*multmv(points,w))).-1.0).*labels,points)
            end
            w
        end
    end
However, when I add @inline to multmv and multvm, it fails to compile. This appears to be a Julia inlining issue: even though @inline is added to the two functions, they are not inlined into main. Instead, a number of calls inside multmv and multvm are inlined (e.g., ParallelAccelerator.API.sum), even though we have explicitly marked them with @noinline.
I rewrote the example since the previous implementation had some issues. The code is below. I will try to replace GEMM calls to see what happens to the reductions throughout the pipeline.
    using ParallelAccelerator
    iter = 15
    @acc function main(iterations::Int64)
        D = 3 # Number of dimensions
        N = 10
        labels = reshape(rand(N),1,N)
        points = rand(D,N)
        w = reshape(2.0.*rand(D)-1.0,1,D)
        for i in 1:iterations
            w -= ((1.0./(1.0.+exp(-labels.*(w*points))).-1.0).*labels)*points'
        end
        w
    end
    W = main(iter)
    println(W)
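As a sanity check on the two formulations in this thread, the sketch below runs one update step of each in plain Julia (no @acc) and compares the results. It assumes Julia 1.x broadcasting syntax (exp. and dotted operators), and the helper names step_loops and step_gemm are hypothetical, introduced only for this comparison.

```julia
# Row-at-a-time products, as in the comprehension-based version.
multmv(a::Matrix{Float64}, b::Vector{Float64}) =
    Float64[sum(a[i, :] .* b) for i in 1:size(a, 1)]
multvm(a::Vector{Float64}, b::Matrix{Float64}) =
    Float64[sum(a .* b[:, i]) for i in 1:size(b, 2)]

# One update with points stored as N×D and w, labels as vectors.
function step_loops(w, labels, points)
    g = (1.0 ./ (1.0 .+ exp.(-labels .* multmv(points, w))) .- 1.0) .* labels
    return w - multvm(g, points)
end

# One update with points stored as D×N and w, labels as 1×D and 1×N rows,
# so that both products become matrix multiplications (GEMM).
function step_gemm(w, labels, points)
    g = (1.0 ./ (1.0 .+ exp.(-labels .* (w * points))) .- 1.0) .* labels
    return w - g * points'
end

N, D = 10, 3
points = rand(N, D)
labels = rand(N)
w = 2.0 .* rand(D) .- 1.0

w_loops = step_loops(w, labels, points)
w_gemm = step_gemm(reshape(w, 1, D), reshape(labels, 1, N), Matrix(points'))

# The row-vector result should match the vector result up to rounding.
@assert isapprox(vec(w_gemm), w_loops)
```

The only difference between the two is the data layout: transposing points and treating w and labels as row vectors turns the per-row reductions into two matrix products.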
Multiple issues with ParallelAccelerator:
With the recent ParallelIR optimizations I checked in, the allocation hoisting and transpose+gemm issues are resolved. The first gemm could be fused with the array operations but this is not a top priority right now.
This program for logistic regression fails. CGen gives the error below but I suspect there are many more issues we need to fix for proper compilation and parallelization.