Dobiasd / FunctionalPlus

Functional Programming Library for C++. Write concise and readable C++ code.
http://www.editgym.com/fplus-api-search/
Boost Software License 1.0
2.07k stars 168 forks source link

parallel_for ? (no, this is not the 1st of April ;-) #260

Closed pthom closed 2 years ago

pthom commented 2 years ago

Yes, this is controversial; since it goes against any kind of functional paradigm ;-)

However, sometimes I would like to run a "for loop" in parallel.

Take this for example:

int main()
{
    // Motivational example, let's say we have a matrix whose rows we want to fill in parallel

    int rows = 1000, cols = 1500;
    // cv::Mat mandelbrot_image = ...

    auto fill_mandelbrot_row = [/* & mandelbrot_image */](size_t row_idx)
    {
        // ... Fill the elements in mandelbrot_image on this row
    };

    std::vector<int> numbers_squared;

    fplus::parallel_for(fill_mandelbrot_row, 0, rows);
}

In that case, we could use this draft:

namespace fplus
{

    // F is of type Int -> Io()
    template<typename F, typename Number>
    void parallel_for(F f, Number lo, Number hi) 
    {
        auto range = fplus::numbers(lo, hi);

        auto f_dummy_return = [&f](Number v) 
        {
            f(v);
            return true;
        };

        fplus::transform_parallelly(f_dummy_return, range);
    }

    // F is of type Int -> Io()
    template<typename F, typename Number>
    void parallel_for_n_threads(int nb_threads, F f, Number lo, Number hi) 
    {
        auto range = fplus::numbers(lo, hi);

        auto f_dummy_return = [&f](Number v) 
        {
            f(v);
            return true;
        };

        fplus::transform_parallelly_n_threads(nb_threads, f_dummy_return, range);
    }

}

Demo on compiler explorer

Dobiasd commented 2 years ago

Interesting! :slightly_smiling_face:

Yeah, I must admit, for some tests, I also already once misused transform_parallelly to parallelly execute some side effects. :grimacing:

I understand, that the f_dummy_return is there because transform_parallelly does not work with functions returning nothing (void). :white_check_mark: Have you already checked that this indirection can be optimized away by the compiler (for performance)?

Consistancy-wise (looking at the parameters of the existing transform_parallelly_n_threads), I think nb_threads should be std::size_t instead of int. :office_worker:

Instead of restricting the new parallel_for to use numbers, we could also allow the user to pass arbitrary containers to iterate over so that they can use fplus::numbers if that's what they want, like so:

template<typename F, typename Container>
void parallel_for(F f, const Container& xs)

What do you think?

pthom commented 2 years ago

Yeah, I must admit, for some tests, I also already once misused transform_parallelly to parallelly execute some side effects. 😬

Same here!

I understand, that the f_dummy_return is there because transform_parallelly does not work with functions returning nothing (void). ✅ Have you already checked that this indirection can be optimized away by the compiler (for performance)?

If I remove the return true, the compiler throws at me a long piece of poetry:

usr/include/c++/v1/__tree:616:38: error: cannot form a reference to 'void'
  static pair<key_type&&, mapped_type&&> __move(__node_value_type& __v) {
                                     ^
usr/include/c++/v1/__tree:665:14: note: in instantiation of template class 'std::__tree_key_value_types<std::__value_type<unsigned long, void>>' requested here
             __tree_key_value_types<_Tp>,
             ^
usr/include/c++/v1/__tree:988:22: note: in instantiation of template class 'std::__tree_node_types<std::__tree_node<std::__value_type<unsigned long, void>, void *> *, std::__tree_node<std::__value_type<unsigned long, void>, void *>>' requested here
    typedef typename _NodeTypes::key_type           key_type;
                     ^
usr/include/c++/v1/map:936:22: note: in instantiation of template class 'std::__tree<std::__value_type<unsigned long, void>, std::__map_value_compare<unsigned long, std::__value_type<unsigned long, void>, std::less<unsigned long>, true>, std::allocator<std::__value_type<unsigned long, void>>>' requested here
    typedef typename __base::__node_traits                 __node_traits;
                     ^
FunctionalPlus/include/fplus/transform.hpp:352:44: note: in instantiation of template class 'std::map<unsigned long, void>' requested here
    std::map<std::size_t, std::decay_t<Y>> thread_results;
                                           ^
FunctionalPlus/include/fplus/side_effects.hpp:323:12: note: in instantiation of function template specialization 'fplus::transform_parallelly_n_threads<(lambda at FunctionalPlus/include/fplus/side_effects.hpp:319:27), std::vector<int>>' requested here
    fplus::transform_parallelly_n_threads(n_threads, f_dummy_return, idxs_to_iterate);
           ^
FunctionalPlus/test/side_effects_test.cpp:121:16: note: in instantiation of function template specialization 'fplus::parallel_for_n_threads<(lambda at FunctionalPlus/test/side_effects_test.cpp:94:32), std::vector<int>>' requested here
        fplus::parallel_for_n_threads(4, fill_mandelbrot_row, rows_idxs);
               ^
In file included from FunctionalPlus/test/side_effects_test.cpp:7:
In file included from FunctionalPlus/cmake-build-debug/doctest_install/include/doctest/doctest.h:2782:
usr/include/c++/v1/map:1131:16: error: cannot form a reference to 'void'
    mapped_type& operator[](const key_type& __k);
               ^
FunctionalPlus/include/fplus/transform.hpp:352:44: note: in instantiation of template class 'std::map<unsigned long, void>' requested here
    std::map<std::size_t, std::decay_t<Y>> thread_results;
                                           ^
FunctionalPlus/include/fplus/side_effects.hpp:323:12: note: in instantiation of function template specialization 'fplus::transform_parallelly_n_threads<(lambda at FunctionalPlus/include/fplus/side_effects.hpp:319:27), std::vector<int>>' requested here
    fplus::transform_parallelly_n_threads(n_threads, f_dummy_return, idxs_to_iterate);
           ^
FunctionalPlus/test/side_effects_test.cpp:121:16: note: in instantiation of function template specialization 'fplus::parallel_for_n_threads<(lambda at FunctionalPlus/test/side_effects_test.cpp:94:32), std::vector<int>>' requested here
        fplus::parallel_for_n_threads(4, fill_mandelbrot_row, rows_idxs);
               ^
In file included from FunctionalPlus/test/side_effects_test.cpp:7:
In file included from FunctionalPlus/cmake-build-debug/doctest_install/include/doctest/doctest.h:2782:
usr/include/c++/v1/map:1133:16: error: cannot form a reference to 'void'
    mapped_type& operator[](key_type&& __k);
               ^
usr/include/c++/v1/map:1136:22: error: cannot form a reference to 'void'
          mapped_type& at(const key_type& __k);
                     ^
usr/include/c++/v1/map:1137:22: error: cannot form a reference to 'void'
    const mapped_type& at(const key_type& __k) const;
                     ^

Consistancy-wise (looking at the parameters of the existing transform_parallelly_n_threads), I think nb_threads should be std::size_t instead of int. 🧑‍💼

Absolutely

Instead of restricting the new parallel_for to use numbers, we could also allow the user to pass arbitrary container

Even better!

I'll post a PR

Cheers

Dobiasd commented 2 years ago

Thanks for the good contribution. :+1: It's always nice to see FunctionalPlus in action. :heart_eyes: