Closed bjosv closed 1 year ago
This PR also might close #34 since it gives hints how to build libco with MSVC.
Thank you for putting in all this work! I agree that having automated testing for subtle code like this is a good idea. Also, I'm sorry for taking so long to get around to reviewing your changes.
...all backends except
ppc.c
.
Why that backend specifically? Is it not supported in QEMU or something?
...at least the described warnings can be seen in the CI logs.
The CI run you link to is a successful one, I'd be interested to see one that reports valgrind failures so I know what to look for.
I'll ask around to see if there's any reason why Near would have used mmap()
at a specific address, or why he might have used mmap()
at all instead of plain malloc()
or similar, which would simplify a lot of these changes I think.
co_serializable means copying the stacks around, including a bunch of pointers. Trying to move then resume a stack will work badly.
But in this case, no stack lives longer than that allocation in that process, so it looks completely unnecessary to me. Maybe a leftover from testing or something.
I have now updated the PR with your proposal to handle unexpected successes, good idea!
I have also added Windows x86 which I missed, and changed test_serialization
to use malloc()
instead.
Using malloc enables it to be built on windows as well.
...all backends except
ppc.c
.Why that backend specifically? Is it not supported in QEMU or something?
It's not supported out-of-the-box by run-on-arch-action which I use to run QEMU on Github Actions, but it seems that QEMU should be able to run big-endian powerpc's. Debian 9 dropped support for it, but it might be possible to run with older releases somehow.. I can do some attempts but I guess it not that straight on.
The CI run you link to is a successful one, I'd be interested to see one that reports valgrind failures so I know what to look for.
The "problem" to visualize what #39 fixes is that the existing tests only indicates warnings in the log and Valgrind is not including this finding in the errorcode / found errors. We might be able to change the tests a bit so that invalid reads or similar also are found due to the stack switching.
Here is an log for a run where #39 is applied to compare to a log without #39
Like:
==3421== Command: ./test_args
==3421==
==3421== Warning: client switching stacks? SP change: 0x1ffefff7e8 --> 0x4de8c90
==3421== to suppress, use: --max-stackframe=[13](https://github.com/bjosv/libco/actions/runs/4873194062/jobs/8692413899#step:6:14)7340480344 or greater
==3421== Warning: client switching stacks? SP change: 0x4de8c78 --> 0x1ffefff7e8
==3421== to suppress, use: --max-stackframe=137340480368 or greater
==3421== Warning: client switching stacks? SP change: 0x1ffefff7e8 --> 0x4df8cd0
==3421== to suppress, use: --max-stackframe=1373404[14](https://github.com/bjosv/libco/actions/runs/4873194062/jobs/8692413899#step:6:15)744 or greater
==3421== further instances of this message will not be shown.
cothread parameterized function example
To detect the "Warning: client switching stacks?" in CI (for which valgrind just gives a success exit code 0) we'd need to run something like valgrind ./test_args | grep -qvF Warning
, which returns a non-zero exit code if Warning is present in the output.
I haven't found any way to make valgrind treat this warning as an error.
I have also added Windows x86 which I missed, and changed
test_serialization
to usemalloc()
instead.
Oh, I was going to do that, to save you the effort of messing with code you hadn't written. Thanks for stepping up!
Debian 9 dropped support for it, but it might be possible to run with older releases somehow.. I can do some attempts but I guess it not that straight on.
Debian has an extensive suite of cross-compilers you can install, and QEMU can run static binaries... it'd be nice to have a script to run all these multi-platform tests locally instead of having to push them to CI, but that doesn't have to block this PR merging.
The "problem" to visualize what #39 fixes is that the existing tests only indicates warnings in the log and Valgrind is not including this finding in the errorcode / found errors.
That's frustrating, but it doesn't seem like Valgrind is very amenable to automated test reporting like this.
Perhaps we should have a valgrind-wrapper.sh
next to build.sh
that does these checks for us, so we don't make the CI scripts even more gnarly. I'll see what I can come up with.
OK, here's a shell-script which you can save as valgrind-wrapper.sh
which I think will do the right thing:
#!/bin/sh
FAILURE=99
SUCCESS=0
expect_failure=0
if [ "$1" = "--expect-failure" ]; then
expect_failure=1
shift
fi
valgrind --error-exitcode="$FAILURE" --log-file="valgrind.log" "$@"
exitcode=$?
# Report the log messages just in case there was anything interesting
cat "valgrind.log" >&2
if [ "$exitcode" -eq "$SUCCESS" ]; then
# Valgrind didn't reject our code, but if it warned
# about stack-switching, we care about that too.
# None of our tests should use enough stack
# to trigger this by accident.
if grep -qlF "Warning: client switching stacks?" "valgrind.log"; then
echo "Test confused Valgrind by switching stacks" >&2
exitcode=$FAILURE
fi
fi
if [ "$expect_failure" -eq 1 ]; then
if [ "$exitcode" -eq "$SUCCESS" ]; then
echo "Test passed, but was expected to fail?" >&2
exitcode=$FAILURE
else
echo "Task failed successfully" >&2
exitcode=$SUCCESS
fi
fi
exit "$exitcode"
If you run ./valgrind-wrapper.sh ./test_args
it prints the Valgrind output then:
Test confused Valgrind by switching stacks
...and exits with code 99. If you run ./valgrind-wrapper.sh --expect-failure ./test_args
it prints the Valgrind output then:
Test confused Valgrind by switching stacks
Task failed successfully
...and exits with code 0. That should simplify the CI scripts a bit.
Looks nice, I will add it.
"Task failed successfully" sounds funny. How about "Task failed as expected" instead?
Added the wrapper-script which cleaned up a bit.
Now #39 will need to flip some --expect-failure
flags which is a helpful indication.
A link to the CI run.
Thanks for all your work on this!
Adds CI builds and test runs for all backends except
ppc.c
.The test examples are run with
valgrind
where possible and the problems found are added as comments for further investigations. #39 might fix some, but at least the described warnings can be seen in the CI logs.Some additional changes were needed to get the test examples working on x86 and Windows. Maybe there is a good reason to use fixed address in test_serialization that I don't know, but it gives problems for x86 and ASan. I'm open for suggestions.
Example run