IntelLabs / ParallelAccelerator.jl

The ParallelAccelerator package, part of the High Performance Scripting project at Intel Labs
BSD 2-Clause "Simplified" License

Advanced Fusion for K-Means #88

Open ehsantn opened 8 years ago

ehsantn commented 8 years ago

The k-means example is a good use case for advanced fusion. In our example we wrote the first comprehension in nested form to enable fusion, but ideally the computation would be written as follows:

points = rand(D,N)
dist = Float64[sqrt(sum((points[:,i].-centroids[:,j]).^2)) for j in 1:numCenter, i in 1:N]
labels = Int[indmin(dist[:,i]) for i in 1:N]
centroids = Float64[ sum(points[j,labels.==i])/sum(labels.==i) for j in 1:D, i in 1:numCenter]

The outer loop of the first comprehension can be fused with the second comprehension, but this requires fusing parfors with different numbers of loops. The inner loops of the third comprehension can also be fused, but this requires loop interchange first.
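To make the target concrete, here is a hand-written sketch of what the computation might look like after both transformations. It is only an illustration under my own assumptions, not output from ParallelAccelerator: the function name `fused_kmeans_step` and the local variable names are hypothetical, and plain loops stand in for parfors. The first loop nest merges the distance and label computations, so the `dist` array is never materialized. The second puts the cluster index `i` outermost, so one pass over the points accumulates both the per-dimension sums and the member count.

```julia
# Hypothetical hand-fused version of one k-means step (illustrative only).
# points is D x N, centroids is D x numCenter.
function fused_kmeans_step(points, centroids)
    D, N = size(points)
    numCenter = size(centroids, 2)

    # First and second comprehensions fused over i: the distance to each
    # centroid is consumed immediately by a running arg-min.
    labels = zeros(Int, N)
    for i in 1:N
        best = Inf
        for j in 1:numCenter
            s = 0.0
            for k in 1:D
                d = points[k, i] - centroids[k, j]
                s += d * d
            end
            dist_ji = sqrt(s)
            if dist_ji < best   # strict <, so the first minimum wins
                best = dist_ji
                labels[i] = j
            end
        end
    end

    # Third comprehension with the cluster loop (i) moved outermost: the
    # membership test, the count, and the D sums share one pass over the points.
    newCentroids = zeros(D, numCenter)
    for i in 1:numCenter
        cnt = 0
        for n in 1:N
            if labels[n] == i
                cnt += 1
                for j in 1:D
                    newCentroids[j, i] += points[j, n]
                end
            end
        end
        for j in 1:D
            newCentroids[j, i] /= cnt   # NaN for an empty cluster, as in the comprehension
        end
    end

    return labels, newCentroids
end
```

For example, `labels, centroids = fused_kmeans_step(points, centroids)` would replace the three comprehensions inside the iteration loop.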