Open brycelelbach opened 8 years ago
@brycelelbach @hkaiser Could you please add a project description here https://github.com/STEllAR-GROUP/hpx/wiki/GSoC-2017-Project-Ideas
I am interested in working on this project. I have seen that previous PRs added OpenMP pragmas to vectorize and parallelize loops. Can someone guide me on how to get started on this issue?
Yes, we have implemented this for the first batch of algorithms. There are still algorithms left that have not been touched, though. Also, we would need a thorough performance analysis of the existing implementation, combined with improvements, if needed.
Status of the par_unseq implementation, checked per algorithm (work in progress):
adjacent_difference
inner_product (does it support any execution policy? Could not find docs. Do we implement it using transform_reduce?)
adjacent_find
all_of
any_of
none_of
copy
copy_if
copy_n (copy uses memmove, copy_if has unseq)
move (uses memmove)
count
count_if
equal
mismatch (unable to trace bp in loop.hpp, likely does not support par_unseq)
exclusive_scan
inclusive_scan
reduce
transform
fill
fill_n
find
find_end
find_first_of
find_if
find_if_not (yet to check)
for_each
for_each_n
generate
generate_n
is_heap
is_heap_until (falls back to seq or par)
is_partitioned
is_sorted
is_sorted_until
lexicographical_compare
max_element
min_element
minmax_element
make_heap
partial_sort (implemented using async)
partial_sort_copy
nth_element (implemented using async futures)
sort (parallel async implementation)
stable_sort
partition
partition_copy
stable_partition
remove
remove_if (conditional in loop body)
remove_copy
remove_copy_if (conditional in loop body)
replace
replace_copy
replace_copy_if
replace_if (conditional in loop body)
reverse
reverse_copy
rotate
rotate_copy
search
search_n (conditional in loop body, cannot vectorize)
set_difference
set_intersection
set_symmetric_difference
set_union
includes
inplace_merge
merge
swap_ranges
uninitialized_copy
uninitialized_copy_n
uninitialized_fill
uninitialized_fill_n
uninitialized_default_construct
uninitialized_default_construct_n
uninitialized_value_construct
uninitialized_value_construct_n
uninitialized_move
uninitialized_move_n
destroy
destroy_n
unique
unique_copy
transform_reduce
transform_exclusive_scan
transform_inclusive_scan
shift_left
shift_right
starts_with
ends_with
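
Several entries above are marked "conditional in loop body". For the replace family, one option may be to write the conditional as a per-element select, so that every iteration performs the same store. The following is only a minimal sketch under that assumption: `replace_negative_unseq` is a hypothetical helper, not HPX code, and whether the loop actually vectorizes depends on the compiler and target. It does not carry over to algorithms such as remove_if or search_n, where the output position or the match state depends on earlier iterations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of HPX): replaces every negative
// element of v with new_value, using a select instead of a branch.
inline void replace_negative_unseq(std::vector<int>& v, int new_value)
{
    int* p = v.data();
    std::size_t const n = v.size();
    assert(p != nullptr || n == 0);

    // Guarded so that builds without OpenMP do not warn about an
    // unknown pragma; the loop is still correct without it.
#if defined(_OPENMP)
#pragma omp simd
#endif
    for (std::size_t i = 0; i < n; ++i)
    {
        // Every iteration stores to p[i]; the condition only picks the
        // value, which compilers can often lower to a blend/mask.
        p[i] = (p[i] < 0) ? new_value : p[i];
    }
}
```

Calling `replace_negative_unseq(v, 0)` on `{1, -2, 3, -4}` leaves `v` as `{1, 0, 3, 0}`.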
Hello @hkaiser, I am interested in this topic for GSoC 2024 and have a question. Is this restricted to using #pragma omp simd for vectorization, or can we also use intrinsic types such as __m128d and __m256d? Some SIMD intrinsics are hard to read.
Everything is possible, I guess - as long as it is portable across architectures (beyond x86), at least in the long run.
The par_vec (aka par_unseq) policy allows interleaving of element access functions, i.e. it is safe to vectorize the iterations of the algorithm.

Explicit engagement of compiler vectorizers through pragmas is probably the best way to ensure this occurs (e.g. #pragma simd, #pragma omp simd).

I will probably take a look into doing this myself while preparing my CppCon talk on parallel algorithms.
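
As an illustration of engaging the vectorizer through a pragma, here is a minimal sketch of a loop annotated with `#pragma omp simd`. The function `saxpy_unseq` is a hypothetical name chosen for this example and is not how HPX implements par_unseq. It assumes a compiler with OpenMP enabled (for example `-fopenmp`); without it the guard drops the pragma and the loop runs as ordinary sequential code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of HPX): y[i] = a * x[i] + y[i].
// Iterations are independent, so interleaving them is safe.
inline void saxpy_unseq(float a, std::vector<float> const& x,
    std::vector<float>& y)
{
    assert(x.size() == y.size());
    float const* xp = x.data();
    float* yp = y.data();
    std::size_t const n = x.size();

    // Ask the compiler to vectorize this loop. The guard avoids an
    // unknown-pragma warning when OpenMP is not enabled.
#if defined(_OPENMP)
#pragma omp simd
#endif
    for (std::size_t i = 0; i < n; ++i)
        yp[i] = a * xp[i] + yp[i];
}
```

With `a = 2`, `x = {1, 2, 3}` and `y = {10, 20, 30}`, the call leaves `y` as `{12, 24, 36}`.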