Closed perillo closed 5 months ago
Thanks a lot, looking good!
As for your arch issue, could it be some Linux security change that was made? Before, the yama scope was one thing that affected kcov (and gdb), see https://www.kernel.org/doc/Documentation/security/Yama.txt, although that typically was only for attachment.
Here, you're tracing a system binary (python), and maybe there are some new restrictions as to what can be done with those? Sort of like what MacOs does, to prevent people debugging the app store etc...
Actually, the new test fails in the CI run (not sure why it wasn't run for your PR though?).
And I also actually saw an issue with the path resolving with the new master. First before the change:
HEAD is now at 692d41c Merge pull request #422 from SimonKagstrom/fix-freebsd-python2-tests
Simons-iMac-7:kcov ska$ rm -rf /tmp/kcov && kcov /tmp/kcov kcov
Usage: kcov [OPTIONS] out-dir in-file [args...]
Where [OPTIONS] are
[...]
Simons-iMac-7:kcov ska$ git checkout master
Previous HEAD position was 692d41c Merge pull request #422 from SimonKagstrom/fix-freebsd-python2-tests
Switched to branch 'master'
Your branch is up to date with 'origin/master'.
# (Was built in another terminal window before this line)
Simons-iMac-7:kcov ska$ rm -rf /tmp/kcov && kcov /tmp/kcov kcov
kcov: error: posix_spawn
kcov: error: Can't start/attach to /Users/ska/projects/build/kcov/src/kcov
When I manually pass the full path to the binary to trace, it works, but not the path lookup (sort of what you discussed in the other bug report, but no segv).
I received an email with the workflow failure, having configured github to run workflows in my fork. Not sure why the workflow was not run in the upstream repository; maybe a bug in github?
The link is https://github.com/perillo/kcov/actions/runs/8301507902.
As for security, I also tried setting kernel.yama.ptrace_scope
to 0 a few days ago, but there was still an error.
Did you remember if with the current master there are no errors?
You can try to add the workflow_dispatch
event (https://docs.github.com/en/actions/using-workflows/manually-running-a-workflow). It is useful for me, since I cat test my PR from my fork.
It master has no errors, IMHO it is better to revert this PR.
The issue with the executable being a system executable (installed in PATH) can be reproduced with a safe commit:
> git checkout 895f21d0
Previous HEAD position was 692d41c Merge pull request #422 from SimonKagstrom/fix-freebsd-python2-tests
HEAD is now at 895f21d ci: Fix warning about node12 / node16 (actions/checkout@v3)
> build/bin/kcov /tmp/kcov zig4-dev
^Cfish: Job 1, 'build/bin/kcov /tmp/kcov zig4-d…' terminated by signal SIGKILL (Forced quit)
I used an executable installed in ~/.local/bin
. It started to using all the cpu, running for a long time (possibly forever).
It seems the binary crashes only if I use kcov
, but this should be the same issue (just that kcov
crashes)
First, I checkout at 692d41c, but I can not reproduce your output.
Instead:
> build/bin/kcov /tmp/kcov zig4-dev
^Cfish: Job 1, 'build/bin/kcov /tmp/kcov zig4-d…' terminated by signal SIGKILL (Forced quit)
zig4-dev
is installed in PATH
(in ~/.local/bin
).
The command runs using 100 of cpu, probably forever?
It seems the process crashes only when using kcov.
In a previous run there were also kcov-system-daemons using CPU, but this may be caused by the daemon still alive after a test.
I can confirm this issue with a commit before my changes.
UPDATE: the bug was caused by the use of a Zig executable.
Sorry for the previous messages; in the first case I was not sure I rebuild kcov
and in the second case I used a Zig executable.
I checkout at 692d41c, but I can not reproduce your output.
Instead, running build/bin/kcov /tmp/kcov test-bin
works correctly.
As for the bug in master (assuming master after this PR was merged), on my system it works correctly.
The master bug is MacOS-specific, this fixes it:
diff --git a/src/engines/mach-engine.cc b/src/engines/mach-engine.cc
index 05bb434..8d2d7a9 100644
--- a/src/engines/mach-engine.cc
+++ b/src/engines/mach-engine.cc
@@ -257,7 +257,7 @@ private:
// Fork the process, in suspended mode
auto& conf = IConfiguration::getInstance();
auto argv = conf.getArgv();
- rv = posix_spawn(&m_pid, argv[0], nullptr, &attr, (char* const*)argv, environ);
+ rv = posix_spawn(&m_pid, executable.c_str(), nullptr, &attr, (char* const*)argv, environ);
if (rv != 0)
{
error("posix_spawn");
Basically it ignored the argument passed to it (incorrectly), which no longer works after your fixes. I'll correct that, and it also explains why I saw it and not you.
Not sure about the test though. After the fix above, it also works on my Mac, so I'm not sure what's going on with the CI runs.
This change also removed calling get_real_path
for the binary-path
configuration value; instead I called get_real_path
when binaryPath
is used when hashing the directory name.
This may change the hashed value and may cause conflicts, but I'm not sure, since this case is not tested.
This patch should restore the original behavior:
diff --git a/src/configuration.cc b/src/configuration.cc
index a778d72..a418d3c 100644
--- a/src/configuration.cc
+++ b/src/configuration.cc
@@ -482,7 +482,9 @@ public:
std::string binaryName;
std::string binaryPath = path;
- (void)look_path(path, &binaryPath);
+ if (look_path(path, &binaryPath) < 0)
+ binaryPath = get_real_path(path);
+
std::pair<std::string, std::string> tmp = split_path(binaryPath);
setKey("binary-path", tmp.first);
@@ -492,7 +494,7 @@ public:
{
setKey("target-directory",
fmt("%s/%s.%08zx", outDirectory.c_str(), binaryName.c_str(),
- (size_t)hash_file(get_real_path(binaryPath))));
+ (size_t)hash_file(binaryPath)));
}
else
{
It calls get_real_path
when the executable is not in PATH
and may exist in the current directory.
Actually, the new test fails in the CI run (not sure why it wasn't run for your PR though?).
@SimonKagstrom I have fixed the test failing in #428.
Currently
getopt_long
is configured to support mixing options and positional arguments, butkcov
don't use the special "--" argument to tellgetopt_long
when to stop parsing the executable arguments. Instead the implementation tries to scan the arguments to find an executable, but the implementation is flawed because there are two required positional arguments, and both may be executable.In order to handle positional arguments correctly, configure
getopt_long
to stop parsing after encountering a positional argument.This change greatly simplified the code, since the pre scanning can be removed. However it may break the workflow of some users.
Add a regression test for issue #414
NOTES
The current implementation does not break the cli interface, but some user workflow may break. In this case an alternative implementation is to configure
getopt_long
to report the positional arguments, too (using "-" instead of "+" as theoptstring
prefix).Also note that the current configuration of
getopt
is the original UNIX behavior.ISSUES
This PR took longer than expected because I started to fix some issues with the test suite.
Additionally, after a system update (on Arch Linux) I started to see more failures. These failures can be reproduced on the original implementation, commit 895f21d0.
Another issue is the regression test that I added. The test fails with:
The command works correctly when running it maually, so I assume there is an issue in the test suite environment.
P.S.
Issue #414 has been closed incorrectly. This PR is responsible to close it.