Closed masterholdy closed 4 years ago
Use `basesize = 1` if each call to `LangfordParallel.loop_inner` takes a long time (say, more than 100 microseconds).
Also, I recommend reading the documentation to understand how `Iterators.partition` works. You are using `Iterators.partition` (and `Partition`) incorrectly. If the documentation of `Iterators.partition` is not clear, I encourage you to ask a question at http://discourse.julialang.org (or on one of the other Julia community sites listed at https://julialang.org/community/).
As I said, you don't need `partition` here. I also recommend reading the documentation to understand how the function works before using it.
You are running `reduce` on a vector with a single element. There is nothing to parallelize for such input.

    julia> using Transducers  # provides Map

    julia> possibilites = 3
    3

    julia> ys = Iterators.partition(0:possibilites, 1) |> Map(range -> @show(range)) |> collect;
    range = 0:0
    range = 1:1
    range = 2:2
    range = 3:3

    julia> ys
    4-element Vector{UnitRange{Int64}}:
     0:0
     1:1
     2:2
     3:3

    julia> map(length, ys)
    4-element Vector{Int64}:
     1
     1
     1
     1

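For comparison, here is a minimal sketch using only `Base` that shows how the second argument of `Iterators.partition` controls the chunk length. The variable names `singles` and `pairs` are made up for illustration; `possibilites` keeps the spelling used in the session above.

```julia
# Chunks produced by Iterators.partition for the same range.
possibilites = 3

# Chunk length 1: four chunks, each holding a single index.
singles = collect(Iterators.partition(0:possibilites, 1))

# Chunk length 2: two chunks, each holding two indices.
pairs = collect(Iterators.partition(0:possibilites, 2))

println(map(length, singles))  # [1, 1, 1, 1]
println(map(length, pairs))    # [2, 2]
```

With chunk length 1, any reduction run inside a single chunk sees only one element, which is why there is nothing left to parallelize within it.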
Did you try something like the following?

    ThreadsX.sum(
        idx -> loop_inner_parallel(parentValue, value, idx, depth, s, n, sn),
        0:possibilites;
        basesize = 1,
        init = 0,
    )

Please note that this requires each call to `loop_inner_parallel` to take a sufficiently long time (e.g., more than 100 microseconds).
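To make the meaning of `basesize` concrete, here is a rough sketch of the idea using only `Base` threading. It is not how ThreadsX is implemented; `chunked_sum` is a hypothetical helper written for this illustration, and it assumes the per-item results are integers.

```julia
# Hypothetical helper: sum f(x) over xs, running one task per chunk of
# `basesize` items. An illustration of the idea only, not ThreadsX code.
function chunked_sum(f, xs; basesize = 1)
    tasks = map(Iterators.partition(xs, basesize)) do chunk
        # Each task reduces its own (non-empty) chunk sequentially.
        Threads.@spawn sum(f, chunk)
    end
    # Combine the per-chunk results on the calling task.
    return sum(fetch, tasks; init = 0)
end

# basesize = 1 spawns one task per index; basesize = 8 spawns one per 8 indices.
println(chunked_sum(i -> i^2, 0:3; basesize = 1))
println(chunked_sum(i -> i^2, 0:100; basesize = 8))
```

Spawning a task has a fixed cost, so one task per item only pays off when each call to the function does enough work, which is the reason for the 100 microsecond rule of thumb above.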
`basesize=8` is the correct way to specify the number of items per task.
FYI, you can't use `Transducers.Partition` with `reduce`, but you can use `Iterators.partition`.
There are various reasons why you might not get a performance boost when using threading in Julia. Note that you have to optimize `LangfordParallel.loop_inner` first. From a quick look, the inner loop uses something like `sum([loop_inner(parentValue, value, i, depth, s, n, sn) for i=0:possibilites])` and `zeros(sn)`, both of which allocate arrays. These are not performance-friendly patterns even for single-threaded code. For example, the `sum` can be written as `sum(loop_inner(parentValue, value, i, depth, s, n, sn) for i=0:possibilites)` or `sum(i -> loop_inner(parentValue, value, i, depth, s, n, sn), 0:possibilites)` to completely avoid allocating an array.
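The three spellings of the sum can be compared side by side in a small self-contained sketch. Here `work` is a hypothetical stand-in for `loop_inner`, and the range size is arbitrary.

```julia
# Hypothetical stand-in for loop_inner; any function of the index works.
work(i) = i * i

possibilites = 1_000

# 1. Array comprehension: builds a temporary Vector before summing.
total_array = sum([work(i) for i = 0:possibilites])

# 2. Generator: same syntax without the brackets, no temporary array.
total_generator = sum(work(i) for i = 0:possibilites)

# 3. Function form: passes the function and the range directly.
total_function = sum(work, 0:possibilites)

# All three compute the same value; only the allocation behavior differs.
@assert total_array == total_generator == total_function
```

To see the difference in allocations, one option is to wrap each variant in a function and measure it with `@allocated` or `@time` after a warm-up call.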