IntelLabs / ParallelAccelerator.jl

The ParallelAccelerator package, part of the High Performance Scripting project at Intel Labs
BSD 2-Clause "Simplified" License

Logistic regression error #41

Closed by ehsantn 8 years ago

ehsantn commented 8 years ago

This program for logistic regression fails. CGen gives the error below, but I suspect there are many more issues we need to fix for proper compilation and parallelization.

using ParallelAccelerator
iter = 15

@acc function main(iterations::Int64)
    D = 10  # Number of dimensions
    N = 100
    w::Array{Float64,1} = 2.0.*rand(D)-1.0
    labels = rand(N)
    points = rand(N,D)
    for i in 1:iterations
       w -= squeeze(((1.0./(1.0.+exp(-labels.*(points*w))).-1.0).*labels)'*points,1)
    end
    w
end

W = main(iter)
println(W)
ERROR: LoadError: AssertionError: CGen: Strings are not supported
 in from_lambda at /home/etotoni/.julia/v0.4/ParallelAccelerator/src/cgen.jl:451
 in from_expr at /home/etotoni/.julia/v0.4/ParallelAccelerator/src/cgen.jl:2121
 in from_root at /home/etotoni/.julia/v0.4/ParallelAccelerator/src/cgen.jl:2502
 in from_worklist at /home/etotoni/.julia/v0.4/ParallelAccelerator/src/cgen.jl:2686
 in from_root at /home/etotoni/.julia/v0.4/ParallelAccelerator/src/cgen.jl:2599
 in from_worklist at /home/etotoni/.julia/v0.4/ParallelAccelerator/src/cgen.jl:2686
 in from_root at /home/etotoni/.julia/v0.4/ParallelAccelerator/src/cgen.jl:2599
 in from_root at /home/etotoni/.julia/v0.4/ParallelAccelerator/src/cgen.jl:2469
 in toCGen at /home/etotoni/.julia/v0.4/ParallelAccelerator/src/driver.jl:178
 in processFuncCall at /home/etotoni/.julia/v0.4/CompilerTools/src/OptFramework.jl:338
 in main at /home/etotoni/.julia/v0.4/CompilerTools/src/OptFramework.jl:400
 in include at ./boot.jl:261
 in include_from_node1 at ./loading.jl:304
 in process_options at ./client.jl:280
 in _start at ./client.jl:378
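For reference, the update inside the loop is a batch gradient step for logistic regression. Reading it off the code itself (labels `y_i`, rows `x_i` of `points`, and sigmoid `σ(z) = 1/(1+e^{-z})`; the data here is random, so the program only exercises the operators, not a real fit):

```latex
w \;\leftarrow\; w \;-\; \sum_{i=1}^{N} \left( \sigma\!\left(y_i \, x_i^{\top} w\right) - 1 \right) y_i \, x_i
```

The `squeeze(..., 1)` at the end collapses the resulting 1×D row back into a length-D vector so it conforms with `w`.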
ninegua commented 8 years ago

squeeze is not a supported function in Domain IR at the moment. CGen translation fails because the implementation of squeeze contains strings and calls throw. squeeze is essentially a reshape operation, and CGen has trouble translating reshape as well.

I think reshape is common enough that it should be handled by CGen and implemented as a J2C array function.
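As a sanity check of the claim that squeeze is just a reshape in this case, here is a minimal sketch using NumPy (an analogy, not ParallelAccelerator code — NumPy's `squeeze`/`reshape` have the same semantics for this shape): dropping the singleton leading dimension of a 1×N array gives the same vector as reshaping it.

```python
import numpy as np

# A 1xN row, like the result of the transposed product in the original program.
A = np.arange(5.0).reshape(1, 5)

v1 = np.squeeze(A, axis=0)   # drop the singleton first dimension
v2 = A.reshape(A.shape[1])   # express the same operation as a reshape

assert np.array_equal(v1, v2)   # identical contents, shape (5,)
```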

ninegua commented 8 years ago

The following code now works (I've verified its correctness against the program above):


@acc begin

function multmv(a::Array{Float64,2}, b::Array{Float64,1})
   Float64[ sum(reshape(a[i,:], size(a,2)) .* b) for i in 1:size(a,1) ]
end

function multvm(a::Array{Float64,1}, b::Array{Float64,2})
   Float64[ sum(a .* b[:,i]) for i in 1:size(b,2) ]
end

function main(iterations::Int64)
    D = 10  # Number of dimensions
    N = 100
    w = 2.0.*rand(D)-1.0
    labels = rand(N)
    points = rand(N,D)
    for i in 1:iterations
       w -= multvm((1.0./(1.0.+exp(-labels.*multmv(points,w))).-1.0).*labels,points)
    end
    w
end

end
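In math terms, the two helpers are just the matrix-vector and vector-matrix products that the original one-liner expressed with `*` and `'`:

```latex
\mathrm{multmv}(A, b)_i = \sum_{j} A_{ij}\, b_j = (A b)_i,
\qquad
\mathrm{multvm}(a, B)_j = \sum_{i} a_i\, B_{ij} = (a^{\top} B)_j
```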

However, when I add @inline to multmv and multvm, compilation fails. This appears to be a Julia inlining issue: even though @inline is added to the two functions, they are not inlined into main. Instead, a number of calls inside multmv and multvm are inlined (e.g., ParallelAccelerator.API.sum) even though we have explicitly marked them with @noinline.

ehsantn commented 8 years ago

I rewrote the example since the previous implementation had some issues. The code is below. I will try replacing the GEMM calls to see what happens to the reductions throughout the pipeline.

using ParallelAccelerator

iter = 15

@acc function main(iterations::Int64)
    D = 3  # Number of dimensions
    N = 10

    labels = reshape(rand(N),1,N)
    points = rand(D,N)
    w = reshape(2.0.*rand(D)-1.0,1,D)

    for i in 1:iterations
       w -= ((1.0./(1.0.+exp(-labels.*(w*points))).-1.0).*labels)*points'
    end
    w
end

W = main(iter)
println(W)
ehsantn commented 8 years ago

Multiple issues with ParallelAccelerator:

ehsantn commented 8 years ago

With the recent ParallelIR optimizations I checked in, the allocation hoisting and transpose+gemm issues are resolved. The first gemm could be fused with the array operations, but this is not a top priority right now.