JuliaLang / Distributed.jl

Create and control multiple Julia processes remotely for distributed computing. Ships as a Julia stdlib.
MIT License

Parallel multiple for loop #21

Open bdeonovic opened 9 years ago

bdeonovic commented 9 years ago

I wanted to run a parallel nested for loop. In the docs it shows how you can do a nested for loop into a single outer loop:

for i = 1:2, j = 3:4
  println((i, j))
end

and an example for parallel for loop:

nheads = @parallel (+) for i=1:200000000

How can I combine these concepts? I tried something along the lines of:

result = @parallel (hcat) for i=1:10,j=1:10

but I get:

ERROR: syntax: invalid assignment location

Also tried with Iterators package:

julia> result = @parallel (hcat) for p in product(0:0.1:1,0:0.1:1)
         [p[1]^2, p[2]^2]
       end
exception on 1: ERROR: MethodError: `getindex` has no method matching getindex(::Iterators.Product, ::Int64)
Closest candidates are:
  getindex(::(Any...,), ::Int64)
  getindex(::(Any...,), ::Real)
  getindex(::FloatRange{T}, ::Integer)

 in anonymous at no file:1679
 in anonymous at multi.jl:1528
 in run_work_thunk at multi.jl:603
 in run_work_thunk at multi.jl:612
 in anonymous at task.jl:6

I am on:

julia> versioninfo()
Julia Version 0.4.0-dev+2684
Commit 8938e3a (2015-01-13 22:01 UTC)
Platform Info:
  System: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz
  LAPACK: liblapack.so.3
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

ViralBShah commented 9 years ago

This would certainly be nice to have, and shouldn't be too difficult. We could start by just parallelizing the outermost loop, but in cases where there isn't enough parallelism there, we should also be able to parallelise across loop nest levels.

Cc: @amitmurthy

timholy commented 9 years ago

See the recent work in JuliaLang/julia#9871 for something that's pretty close to what you need.

kmsquire commented 9 years ago

Relevant (though I'm not sure how useful): OpenMP has a collapse clause for, well, collapsing nested parallel loops.

ViralBShah commented 9 years ago

@timholy I missed JuliaLang/julia#9871. Thanks!

bdeonovic commented 9 years ago

I am a bit confused on what @simd actually does. Can it be used to do what I want in my original post?

ArchRobison commented 9 years ago

@simd is for exploiting vector arithmetic units, not multiple threads. It's currently limited to a single loop. Though with PR JuliaLang/julia#9876 you can use it on Cartesian ranges, albeit it just vectorizes the fastest index instead of collapsing/flattening the range.

On modern processors, best results are often obtained by multi-threading the outer loop(s) and vectorizing the innermost loop.
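A minimal sketch of that pattern in current Julia, assuming a Julia 1.x session started with more than one thread; the function name `scale_columns!` and the data are hypothetical and only serve the illustration. The outer loop over columns is split across threads, and the inner loop over the fastest (first) index is marked for vectorization:

```julia
# Hypothetical example: out[i, j] = w[j] * x[i, j]
function scale_columns!(out::Matrix{Float64}, x::Matrix{Float64}, w::Vector{Float64})
    # Outer loop: columns are distributed over the available threads.
    Threads.@threads for j in axes(x, 2)
        # Inner loop: contiguous in memory, so it is a candidate for SIMD.
        @inbounds @simd for i in axes(x, 1)
            out[i, j] = w[j] * x[i, j]
        end
    end
    return out
end

x = rand(4, 3)
w = [1.0, 2.0, 3.0]
out = scale_columns!(similar(x), x, w)
```

Whether the inner loop is actually vectorized is up to the compiler; `@simd` only grants permission to reorder the iterations.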

nilshg commented 9 years ago

Any news on this? It appears that @parallel still doesn't accept one-line nested loops, or is there a way to achieve this by now?

timholy commented 9 years ago

You can do this manually today. Just do it over the outer loop only, or use ind2sub to convert between an overall linear index and the multidimensional/cartesian index you want.

nilshg commented 9 years ago

Sorry, but I'm not sure I understand your suggestion. Would your first suggestion be to convert

@parallel for i=1:10, j=1:10, k=1:10

to

@parallel for i=1:10
    for j=1:10, k=1:10

And I'm afraid that after reading the relevant entry in the docs I have no idea what ind2sub is doing exactly.
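For readers on current Julia, the "outer loop only" variant might look like the sketch below. It assumes Julia 1.x, where `@parallel` was renamed `@distributed` and lives in the `Distributed` stdlib; the loop body (`i * j * k`) is a made-up placeholder. Only the `i` loop is split across workers; the `j, k` loops run serially inside each iteration.

```julia
using Distributed

# Only the outermost loop is distributed; the inner nest is an ordinary serial loop.
total = @distributed (+) for i in 1:10
    acc = 0
    for j in 1:10, k in 1:10
        acc += i * j * k      # placeholder work
    end
    acc                       # value handed to the (+) reducer
end
```

This works well when the outer range has at least as many iterations as there are workers; otherwise some workers sit idle, which is the case the flattened-index approach addresses.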

pao commented 9 years ago

@timholy Would eachindex also be a reasonable alternative?

@nilshg The ind2sub approach:

@parallel for i in 1:length(A)
    indices = ind2sub(size(A), i)
    @show A[indices...]
end

(Since I'm using @show this will be a mess, but I hope it gets the point across.)
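The same idea in current Julia, as a hedged sketch: `ind2sub` was removed in Julia 1.0, and indexing a `CartesianIndices` object with a linear index plays the same role. The array `A` and the loop body are hypothetical.

```julia
using Distributed

A = reshape(collect(1:24), 2, 3, 4)   # example 3-d array (made-up data)
cart = CartesianIndices(A)            # maps a linear index to (i, j, k)

# One flat loop over all elements; each iteration recovers its Cartesian index.
total = @distributed (+) for n in 1:length(A)
    i, j, k = Tuple(cart[n])
    A[i, j, k] * i
end
```

Because the flat range `1:length(A)` covers every (i, j, k) combination, the work is divided evenly regardless of how small any single dimension is.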

tlnagy commented 7 years ago

Any movement on this? I just ran into this today and it would be very nice to have this feature.

rkumar-slim commented 5 years ago

I am looking into the same problem. I have two nested for loops, and I want to parallelize the first loop over workers and the second loop over threads. Is this possible?
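One way this could be arranged, sketched under assumptions: Julia 1.x, workers already added, and each worker started with several threads (for example via the `JULIA_NUM_THREADS` environment variable). The helper name `row_sum` and its body are hypothetical.

```julia
using Distributed

# Define the inner, threaded loop on every process.
@everywhere function row_sum(i, n)
    partial = zeros(Int, n)
    # Inner loop: split across the threads of whichever process runs this call.
    Threads.@threads for j in 1:n
        partial[j] = i * j    # each iteration writes its own slot, so no data race
    end
    return sum(partial)
end

# Outer loop: split across workers, results combined with (+).
total = @distributed (+) for i in 1:10
    row_sum(i, 20)
end
```

If a worker has only one thread, the inner loop simply runs serially there, so the sketch stays correct either way.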

cossio commented 5 years ago

A possible workaround:

@parallel for (i,j) in collect(Iterators.product(1:10, 1:20))

TAJD commented 5 years ago

I think the example in the docs on using shared arrays in parallel computing gives some demonstrations of how to go about parallelizing multiple iterators.

cossio commented 5 years ago

@TAJD where exactly?

TAJD commented 5 years ago

I think I misunderstood the problem. But I found the notes on using SharedArrays helpful for parallelizing my own simulations.
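For context, a rough sketch of how a `SharedArray` can be combined with a flattened two-dimensional loop. It assumes Julia 1.x with all workers on the same machine (a requirement of `SharedArrays`); the array size and the stored value `i * j` are made up.

```julia
using Distributed, SharedArrays

S = SharedArray{Int}(10, 20)      # visible to all local workers
cart = CartesianIndices(S)

# No reducer: each iteration writes its own element, and @sync waits for completion.
@sync @distributed for n in 1:length(S)
    i, j = Tuple(cart[n])
    S[i, j] = i * j
end
```

Without the `@sync`, a reducer-less `@distributed` loop returns immediately and `S` may still be partially filled when it is read.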

cossio commented 5 years ago

Maybe the workaround I suggested earlier:

@distributed for (i,j) in collect(Iterators.product(1:10, 1:20))

could be the default behavior of @distributed when it encounters a multi-dimensional loop? Unfortunately I don't know enough about macros to prepare a PR myself with this approach.

Also it is not clear what the reducer would do in this case.
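With the `collect` workaround as it stands today, the reducer appears to behave as in the one-dimensional case: it combines the value of the last expression of each iteration, with the collected product traversed in column-major (linear) order. A small sketch assuming Julia 1.x; the ranges and loop bodies are illustrative only.

```julia
using Distributed

# Scalar reduction over all (i, j) pairs.
total = @distributed (+) for (i, j) in collect(Iterators.product(1:10, 1:20))
    i * j
end

# The hcat example from the original post, on a coarser 3×3 grid.
grid = collect(Iterators.product(0:0.5:1, 0:0.5:1))
cols = @distributed (hcat) for p in grid
    [p[1]^2, p[2]^2]
end
```

Here `cols` is a 2×9 matrix with one column per grid point, so the grid's two-dimensional shape is lost in the result. What a reducer should do with that shape is the open design question.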