IntelLabs / ParallelAccelerator.jl

The ParallelAccelerator package, part of the High Performance Scripting project at Intel Labs
BSD 2-Clause "Simplified" License

RWS wrong for HPAT kmeans #120

Closed ehsantn closed 7 years ago

ehsantn commented 7 years ago

RWS (the read/write set) is wrong for the HPAT kmeans example. The first parfor inside the first big parfor is shown below. Its loop index is _57, but the RWS records the index as _38. This is a follow-up to #116.

PIR Body: 
    _52 = (Base.unsafe_arrayref)(_5::Array{Float64,2},_57::Int64,_42)::Float64
    _56 = (Base.unsafe_arrayref)(_9::Array{Float64,2},_57::Int64,_48)::Float64
    SSAValue(21) = (Core.Intrinsics.sub_float)(_52,_56)::Float64
    SSAValue(22) = (Core.Intrinsics.box)(Float64,SSAValue(21))::Float64
    _59 = SSAValue(22)
    _60 = _59::Float64
    SSAValue(23) = (Core.Intrinsics.checked_trunc_sint)(Int32,2)::Int32
    SSAValue(24) = (Core.Intrinsics.box)(Int32,SSAValue(23))::Int32
    SSAValue(25) = (Core.Intrinsics.powi_llvm)(_60,SSAValue(24))::Float64
    SSAValue(26) = (Core.Intrinsics.box)(Float64,SSAValue(25))::Float64
    _61 = SSAValue(26)
    _62 = _61::Float64
    SSAValue(27) = (Core.Intrinsics.add_float)(_63,_62)::Float64
    SSAValue(28) = (Core.Intrinsics.box)(Float64,SSAValue(27))::Float64
    _63 = SSAValue(28)
Loop Nests: 
    ParallelAccelerator.ParallelIR.PIRLoopNest(:(_57::Int64),1,:(_58::Int64),1)
Reductions: 
    ParallelAccelerator.ParallelIR.PIRReduction(:(_63::Float64),0.0,ParallelAccelerator.ParallelIR.DelayedFunc(ParallelAccelerator.ParallelIR.reductionReplaceDict,Any[Any[:(SSAValue(27) = (Core.Intrinsics.add_float)(_63,_62)::Float64),:(SSAValue(28) = (Core.Intrinsics.box)(Float64,SSAValue(27))::Float64),:(_63 = SSAValue(28))],:(_63),:(_62)]))
Poststatements: 
    0    
))
DrTodd13 commented 7 years ago

I fixed this by no longer keeping the RWS around in the parfor and instead re-creating it as needed. Trying to keep all this state in sync is too complex.
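
For readers unfamiliar with the trade-off, here is a minimal sketch of why recomputing derived state on demand avoids this class of bug. It is a toy model, not ParallelAccelerator's code: `ToyParfor`, `collect_reads!`, `read_set`, and `rename` are hypothetical names invented for illustration, and the symbols `i38` and `i57` merely stand in for the `_38` and `_57` slots above.

```julia
# Hypothetical toy model; none of these names are ParallelAccelerator's real API.
struct ToyParfor
    body::Vector{Expr}   # statements such as :(t = A[i, j])
end

# Collect an (array, indices...) tuple for every array reference in an expression.
function collect_reads!(reads::Set{Any}, ex)
    if ex isa Expr
        if ex.head === :ref
            push!(reads, Tuple(ex.args))
        end
        foreach(a -> collect_reads!(reads, a), ex.args)
    end
    return reads
end

# Recompute the read set from the current body every time it is needed,
# instead of storing it in the parfor and updating it on each rewrite.
function read_set(p::ToyParfor)
    reads = Set{Any}()
    foreach(stmt -> collect_reads!(reads, stmt), p.body)
    return reads
end

# Rename a symbol everywhere in an expression (for example, a loop index).
function rename(ex, old::Symbol, new::Symbol)
    if ex isa Expr
        return Expr(ex.head, map(a -> rename(a, old, new), ex.args)...)
    end
    return ex === old ? new : ex
end

p = ToyParfor([:(t = A[i38, j])])
cached = read_set(p)   # snapshot taken before the index is renamed
q = ToyParfor([rename(s, :i38, :i57) for s in p.body])

@assert (:A, :i38, :j) in cached        # stale snapshot still names the old index
@assert (:A, :i57, :j) in read_set(q)   # recomputed set reflects the rename
```

The stale `cached` set mirrors the symptom reported here (the old index surviving in the RWS after the body changed), while `read_set(q)` cannot go out of sync because it is derived from the body each time. The cost is repeated traversal of the body.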