Closed Dobiasd closed 1 month ago
Hello Dobias,
I tried to tackle this issue. Unfortunately I failed, and I cannot study it more in detail because I'm stuck on other projects.
However, I have some details which might be of interest to you.
The test will pass without any issue if I compile for x64 (I have an issue when compiling for ARM64, see below)
On Windows: I'm a bit confused about this error because it is a bit inconsistent: I swear I saw the CI work, then fail for no good reason.
on macOS: the CI works 70% of the times, but can fail: see this run for example.
99% tests passed, 1 tests failed out of 423
Total Test time (real) = 4.63 sec
The following tests FAILED:
52 - container_common_test - interweave (Failed)
Errors while running CTest
pascal@PASCALTHOMEA90C MINGW64 /f/dvp/OpenSource/FunctionalPlus (fix_win_ci_segfault)
$ ./script/ci_build.sh
when trying to diagnose this on my side, I see that it works perfectly in debug mode, But I can reproduce the issue when using "RelWithDebInfo"
I'm not sure this is related to the seg fault you're experiencing but let me show you what I saw:
TEST_CASE("container_common_test - interweave")
{
using namespace fplus;
REQUIRE_EQ(interweave(IntVector({ 1, 3 }), IntVector({ 2, 4 })), IntVector({ 1, 2, 3, 4 }));
REQUIRE_EQ(interweave(IntVector({ 1, 3, 5, 7 }), IntVector({ 2, 4 })), IntVector({ 1, 2, 3, 4, 5, 7 }));
// This test fails in release mode (but will work in Debug mode), when compiling for windows ARM
//REQUIRE_EQ(unweave(IntVector({ 0, 1, 2, 3 })), std::make_pair(IntVector({ 0, 2 }), IntVector({ 1, 3 })));
// Let's show the result
auto u = unweave(IntVector({ 0, 1, 2, 3 }));
std::cout << fplus::show(u) << std::endl; // will show ([1, 3], [0, 2]) instead of ([0, 2], [1, 3])
//REQUIRE_EQ(unweave(IntVector({ 0, 1, 2, 3, 4 })), std::make_pair(IntVector({ 0, 2, 4 }), IntVector({ 1, 3 })));
}
I don't know if this is related.
PS: I might contact you by email in the next days because I'm working hard on a new project which could be interesting for researchers. I'm looking for feedback, and I felt you might have an interesting feedback about it (if you have time)
Wow, thanks a lot for the investigation! :heart:
The flakiness of the macOS CI should be fixed by this commit, which I just pushed. :white_check_mark:
The problem with the Windows tests, however, is still a riddle to me. :monocle_face:
And, sure, looking forward to your email. :+1:
The interweave issue is probably unrelated, and due to a compiler bug.
You should probably move the ++ increment at the end of the loop.
You are probably fighting against the limitation of the GitHub runners. Maybe some limitations in threading.
I just rented a visual studio machine on Azure to test it. With Visual Studio Professional 2022 17.10 on x86 64bits / Azure, the tests do run fine.
Thanks a lot for the interweave fix! :heart_eyes:
Regarding the Windows tests: If they cause so much trouble and additional complexity, maybe we should just drop them completely? Edit: Meh, I now remember there were quite a few problems with MSVC, which required code changes in fplus. Not having test coverage for those would also not be too nice. : /
On a test branch, I reproduced the segfault with a minimal example:
TEST_CASE("show_versions - std_thread")
{
auto t = std::thread([]() { });
t.join();
}
TEST_CASE("show_versions - std_async")
{
auto handle = std::async(std::launch::async, []() { return 1; });
handle.get();
}
...
2/3 Test #2: show_versions - std_thread ....... Passed 0.17 sec
3/3 Test #3: show_versions - std_async ........***Exception: SegFault 0.29 sec
On a test branch, I reproduced the segfault with a minimal example:
Wow, well done!
The next step to be able to gather more feedback from other users would be to set up a very simple repository. It would contain:
#include ...
int main()
{
auto handle = std::async(std::launch::async, []() { return 1; });
handle.get();
}
and it would compile and run this in CI. What do you think? I think I will try it (or you can if you are impatient to see the result: I cannot do it before this afternoon).
The advantage of this version is that if it does not fail, we could then add doctest, and see if the combination doctest + thread is the cause of the issue
Good idea, thanks, and no hurry. ☺️
Ok, it can be reproduced with the simplest possible async code!
Look at the comments on this SO question:
I think the bug was fixed - under my feet - by GitHub while I was reporting it to Stack Overflow. You may want to rerun the CI, it worked on my side
I am not lucky these days with compiler and platform bugs 🙃
Cool minimal example! Perhaps your great report made somebody fix the bug (silently).
Ah, no, it seems the work on the other issue happened before.
Yeah, not lucky indeed. :grimacing: We would have solved the problem too by simply doing nothing for a few days. :sloth:
Closing this issue, since the CI runner is fixed. :heavy_check_mark:
Thanks again for the nice collaboration. Situations like this (especially with long-term contributors like you) are what make maintaining an open-source project worthwhile for me. :heart:
https://github.com/Dobiasd/FunctionalPlus/actions/runs/9435686090/job/25989531257
Looks like something thread-related.