tcbrindle / flux

A C++20 library for sequence-orientated programming
https://tristanbrindle.com/flux/
Boost Software License 1.0
472 stars 29 forks source link

Update the internal iteration protocol #100

Closed tcbrindle closed 3 weeks ago

tcbrindle commented 1 year ago

This is an experimental PR to try out the ideas in #99, namely replacing the existing for_each_while(seq, pred) -> cursor_t customisation point with a new pair of customisation points:

auto iterate_while(auto& seq, auto pred, cursor_t from) -> cursor_t;
auto iterate_while(auto& seq, auto pred, cursor_t from, cursor_t to) -> cursor_t;

The idea being that this will allow us to use internal iteration in more places.

A rough plan of action:

tcbrindle commented 1 year ago

@brycelelbach Don't worry, I'm not going to change anything out from under you: even if I get this all done and it shows an improvement, I'll wait till you've finished your work on #92 and then update cartesian_product to use the new protocol if necessary

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.03% :warning:

Comparison is base (0a0abbb) 97.67% compared to head (9901983) 97.65%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #100 +/- ## ========================================== - Coverage 97.67% 97.65% -0.03% ========================================== Files 66 66 Lines 2236 2172 -64 ========================================== - Hits 2184 2121 -63 + Misses 52 51 -1 ``` | [Files Changed](https://app.codecov.io/gh/tcbrindle/flux/pull/100?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle) | Coverage Δ | | |---|---|---| | [include/flux/op/filter.hpp](https://app.codecov.io/gh/tcbrindle/flux/pull/100?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle#diff-aW5jbHVkZS9mbHV4L29wL2ZpbHRlci5ocHA=) | `100.00% <ø> (ø)` | | | [include/flux/core/default\_impls.hpp](https://app.codecov.io/gh/tcbrindle/flux/pull/100?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle#diff-aW5jbHVkZS9mbHV4L2NvcmUvZGVmYXVsdF9pbXBscy5ocHA=) | `100.00% <100.00%> (ø)` | | | [include/flux/core/sequence\_access.hpp](https://app.codecov.io/gh/tcbrindle/flux/pull/100?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle#diff-aW5jbHVkZS9mbHV4L2NvcmUvc2VxdWVuY2VfYWNjZXNzLmhwcA==) | `100.00% <100.00%> (ø)` | | | [include/flux/op/for\_each\_while.hpp](https://app.codecov.io/gh/tcbrindle/flux/pull/100?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle#diff-aW5jbHVkZS9mbHV4L29wL2Zvcl9lYWNoX3doaWxlLmhwcA==) | `100.00% <100.00%> (ø)` | | | [include/flux/op/map.hpp](https://app.codecov.io/gh/tcbrindle/flux/pull/100?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle#diff-aW5jbHVkZS9mbHV4L29wL21hcC5ocHA=) | `93.33% <100.00%> (ø)` | | ... and [12 files with indirect coverage changes](https://app.codecov.io/gh/tcbrindle/flux/pull/100/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

brycelelbach commented 1 year ago

@tcbrindle sounds good - I'll try to get you an updated PR sometime next week.

brycelelbach commented 1 year ago

@tcbrindle On the topic of improving the internal iteration protocol, I'd like to see the protocol evolved to better support parallelism in the future. I'd love to use flux to prototype a next generation parallel algorithms library.

The customization point today, for_each_while, is an early-exit for loop abstraction.

In the non-parallel case, we can implement non-early-exit algorithms like for_each in terms of for_each_while with essentially zero cost. The inverse is not true; we can't implement early-exit algorithms like find in terms of a non-early-exit for_each.

However, in parallel, implementing non-early-exit algorithms like for_each in terms of for_each_while is NOT zero cost. To parallelize, you'd have to have some sort of concurrent rollback or cancellation mechanism. On some platforms, like GPUs, it's actually more efficient to implement parallel early-exit algorithms like find by NOT actually exiting early.

Some options:

tcbrindle commented 1 year ago

I'd love to use flux to prototype a next generation parallel algorithms library.

That would be awesome!

Some options:

  • Have two customization points, for_each and for_each_while.
  • Add a compile time parameter to for_each_while that indicates whether or not early-exit is needed.

I guess in general we'd want some way to indicate that we want parallel execution (for all algorithms where it's possible), perhaps something like the standard execution policies? In that case we could say that for_each_while (or iterate_while), when passed a non-sequential execution policy, doesn't do early termination. Would that be sufficient?

tcbrindle commented 3 weeks ago

This has bitrotted, and I have better ideas for how to solve the problem