Low level optimizations for contiguous sequences

DeveloperPaul123 commented 1 year ago

Description

Adds low level optimizations to address #57

Optimizations

[x] std::memset with fill()
[x] std::memcmp with equal()
[x] std::memcmp with compare()
[x] std::memchr with find()
~~std::memmem with search()~~

DeveloperPaul123 commented 1 year ago

@tcbrindle I filled this PR to see if I'm on the right track (any feedback much appreciated) and also to make it know that the work is in progress for that particular issue.

tcbrindle commented 1 year ago

Thanks for working on this! It looks like there's a syntax error which is tripping up the non-Windows builds, but the general approach looks good to me 👍

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 94.59% and project coverage change: +0.12% :tada:

Comparison is base (051dce9) 97.58% compared to head (96d201f) 97.70%. Report is 30 commits behind head on main.

Additional details and impacted files

```diff @@ Coverage Diff @@ ## main #103 +/- ## ========================================== + Coverage 97.58% 97.70% +0.12% ========================================== Files 66 67 +1 Lines 2276 2398 +122 ========================================== + Hits 2221 2343 +122 Misses 55 55 ``` | [Files Changed](https://app.codecov.io/gh/tcbrindle/flux/pull/103?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle) | Coverage Δ | | |---|---|---| | [include/flux/core/utils.hpp](https://app.codecov.io/gh/tcbrindle/flux/pull/103?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle#diff-aW5jbHVkZS9mbHV4L2NvcmUvdXRpbHMuaHBw) | `100.00% <ø> (ø)` | | | [include/flux/op/split\_string.hpp](https://app.codecov.io/gh/tcbrindle/flux/pull/103?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle#diff-aW5jbHVkZS9mbHV4L29wL3NwbGl0X3N0cmluZy5ocHA=) | `100.00% <ø> (ø)` | | | [include/flux/op/fill.hpp](https://app.codecov.io/gh/tcbrindle/flux/pull/103?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle#diff-aW5jbHVkZS9mbHV4L29wL2ZpbGwuaHBw) | `94.11% <91.66%> (-5.89%)` | :arrow_down: | | [include/flux/op/find.hpp](https://app.codecov.io/gh/tcbrindle/flux/pull/103?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle#diff-aW5jbHVkZS9mbHV4L29wL2ZpbmQuaHBw) | `96.15% <93.33%> (-3.85%)` | :arrow_down: | | [include/flux/op/equal.hpp](https://app.codecov.io/gh/tcbrindle/flux/pull/103?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle#diff-aW5jbHVkZS9mbHV4L29wL2VxdWFsLmhwcA==) | `92.00% <94.11%> (+10.18%)` | :arrow_up: | | [include/flux/op/compare.hpp](https://app.codecov.io/gh/tcbrindle/flux/pull/103?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle#diff-aW5jbHVkZS9mbHV4L29wL2NvbXBhcmUuaHBw) | `97.22% <96.15%> (-2.78%)` | :arrow_down: | | [include/flux/op/output\_to.hpp](https://app.codecov.io/gh/tcbrindle/flux/pull/103?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle#diff-aW5jbHVkZS9mbHV4L29wL291dHB1dF90by5ocHA=) | `94.44% <100.00%> (+1.11%)` | :arrow_up: | ... and [9 files with indirect coverage changes](https://app.codecov.io/gh/tcbrindle/flux/pull/103/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tristan+Brindle)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

DeveloperPaul123 commented 1 year ago

I've noticed that with the any_of move, split() seems to have broken? I'm getting an exception with this code in test_split.cpp

{
    auto sv = "the quick brown fox"sv;

    auto split = flux::split(flux::ref(sv), ' ');

    using S = decltype(split);

    static_assert(flux::multipass_sequence<S>);
    static_assert(flux::contiguous_sequence<flux::element_t<S>>);

    static_assert(flux::multipass_sequence<S const>);
    static_assert(flux::contiguous_sequence<flux::element_t<S const>>);

    STATIC_CHECK(check_equal(std::move(split).map(to_string_view),
            std::array{"the"sv, "quick"sv, "brown"sv, "fox"sv}));
}

DeveloperPaul123 commented 1 year ago

Just looking for an initial round of feedback; I'll have to address the build issues.

tcbrindle commented 1 year ago

I've noticed that with the any_of move, split() seems to have broken? I'm getting an exception with this code in test_split.cpp

{
    auto sv = "the quick brown fox"sv;

    auto split = flux::split(flux::ref(sv), ' ');

    using S = decltype(split);

    static_assert(flux::multipass_sequence<S>);
    static_assert(flux::contiguous_sequence<flux::element_t<S>>);

    static_assert(flux::multipass_sequence<S const>);
    static_assert(flux::contiguous_sequence<flux::element_t<S const>>);

    STATIC_CHECK(check_equal(std::move(split).map(to_string_view),
            std::array{"the"sv, "quick"sv, "brown"sv, "fox"sv}));
}

This is caused by the new specialisation of find (which split uses) returning the wrong thing in some cases. See the review comments above (the any_of thing is just a coincidence).

DeveloperPaul123 commented 1 year ago

I believe I've addressed most, if not all of your comments. Unit tests are also all passing for me. Please take another look when you have a moment. Thanks!

DeveloperPaul123 commented 1 year ago

Whoops! I missed that, thanks!

DeveloperPaul123 commented 1 year ago

Great!

Is there a need for more unit tests to make codecov happy?

tcbrindle commented 1 year ago

Looking at the CodeCov complaints, most of them seem to be because it doesn't understand std::is_constant_evaluated() This is technically a runtime branch, but we can't ever test it in a way that will make CodeCov happy so we'll just ignore it.

The other complaint is that we're not testing the short-circuit case in equal when both sequences are sized but their sizes differ... but that definitely looks like it's being tested so my guess is that the compiler has inlined the it into the test function, so as far as CodeCov is concerned the line is never executed.

So I think we're fine :)

tcbrindle commented 1 year ago

Hmmm, there is one thing though.

I was just looking back at the Nanorange issue to double-check that the conditions are now correct, and I'd forgotten about this comment from Eric:

When using the mem functions from an algorithm used with raw pointers as iterators, be sure to check the pointers for null before calling the mem functions. IIRC, they don't appreciate being called with null.

Although flux::data() "should" never return nullptr, there are a couple of ways that this might happen if developers get things wrong.

Would you mind adding lines equivalent to FLUX_ASSERT(flux::data(seq) != nullptr) before the calls to the various std::memxxx functions, so we can try to avoid UB even in the case of buggy sequence implementations?

Sorry to add more work at the last minute, I should have thought about it earlier 😞

DeveloperPaul123 commented 1 year ago

Done! No problem at all. But note that tests now fail. Do we need to take into account flux::empty<> sequences?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test-libflux.exe is a Catch v2.13.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
drop
-------------------------------------------------------------------------------
flux/test/test_drop.cpp(104)
...............................................................................

flux/test/test_drop.cpp(104): FAILED:
due to unexpected exception with message:
  flux/include\flux/op/equal.hpp:62: Fatal error: assertion    
  'flux::data(seq2) != nullptr' failed

===============================================================================
test cases:  74 |  73 passed | 1 failed
assertions: 556 | 555 passed | 1 failed

tcbrindle commented 1 year ago

Done! No problem at all. But note that tests now fail. Do we need to take into account flux::empty<> sequences?

Thanks! I'd completely forgotten about flux::empty... so much for "nullptr should never happen"! (But it's good that we caught it)

So I guess what we really need to do is check the size of the sequence first... if that is zero then we can exit early, as there's no work to do. Otherwise, we get the data pointer and assert that it's not nullptr, and then finally call the C library function. That is, something like:

auto size = flux::usize(seq);
if (size == 0) {
    return;
}
auto data = flux::data(seq);
FLUX_ASSERT(data != nullptr);

std::memset(data, val, size);

(adjusted as appropriate for each case).

This turned out to be more complicated than I thought!

DeveloperPaul123 commented 1 year ago

Sounds good, I'll get to work on that.

How would you like to handle the different cases in compare()? What would make the most sense if both sequences are empty? Or if only one sequence is empty?

Maybe I should just look at the unit tests 😅

tcbrindle commented 1 year ago

How would you like to handle the different cases in compare()? What would make the most sense if both sequences are empty? Or if only one sequence is empty?

For compare(), I think we get the right behaviour if we do something like:

auto const seq1_size = flux::usize(seq1);
auto const seq2_size = flux::usize(seq2);
auto min_size = std::min(seq1_size, seq2_size);

int cmp_result = 0;

if (min_size > 0) {
    auto const data1 = flux::data(seq1);
    FLUX_ASSERT(data1 != nullptr);
    auto const data2 = flux::data(seq2);
    FLUX_ASSERT(data2 != nullptr);

    cmp_result = std::memcmp(data1, data2, min_size);
}

That is, we only call the C function if the number of bytes is greater than zero, also making sure that the data pointers are not nullptr.

This is getting messier and messier, sorry!

DeveloperPaul123 commented 1 year ago

Ok I think this is ready now and all tests pass for me!

DeveloperPaul123 commented 1 year ago

I was able to add some tests, but not sure if I covered everything you had in mind. I also address all (I think) your feedback from the latest review. Sorry for the silly oversights at times and thanks for all the feedback!

DeveloperPaul123 commented 1 year ago

Looks like I didn't cover all the cases for compare(). Some help with this would be appreciated.

DeveloperPaul123 commented 1 year ago

Done!

tcbrindle commented 1 year ago

It's done!! 🎉

Thanks so much!

tcbrindle / flux

Low level optimizations for contiguous sequences #103

Description

Codecov Report