Closed rotu closed 4 years ago
Does Ninja output the error messages to stderr
? If not, colcon doesn't know about them and by default hides stdout
output.
As you can see in the attached log files, Ninja redirects the stderr of subprocesses to its stdout.
This seems similar to colcon/colcon-core#236. There would need to be custom logic to redirect the specific Ninja error message to stderr
. Otherwise colcon
can't do anything about it.
So the question is are you interested in implementing custom logic for this? If yes, I would be happy to review a pull request.
Oof. I just looked into what it would take. The assumption that "the only feedback from build tools the user is immediately interested in is stderr" seems buried deep in the architecture of colcon_core.subprocess.run/check_output
. So this is going to be less trivial that I assumed.
Before I start:
Why does colcon run subprocesses in pseudoterminals?
Because invoked tools operate differently when being invoked in a shell / pseudoterminal rather than without. One example is colorization of compiler warnings. Without using a pseudoterminal the compiler coming through e.g. make
is not colorized anymore.
How does colcon prevent stderr output from subprocesses being interleaved in its own stdout? It seems like there are no precautions.
There are only two possible options:
stdout
and stderr
orBecause invoked tools operate differently when being invoked in a shell / pseudoterminal rather than without. One example is colorization of compiler warnings. Without using a pseudoterminal the compiler coming through e.g. make is not colorized anymore.
Okay. So $MAKE_TERMOUT
and $MAKE_TERMERR
would fix the output for Make, if the output of Colcon is indeed going to a terminal and not a log file. It would also work on Windows, where PTYs are not supported. Is there another place this breaks functionality?
So $MAKE_TERMOUT and $MAKE_TERMERR would fix the output for Make,
What about any other invoked command which might have shell/pseudoterminal conditional behavior? The current approach of using PTYs works for all of them - I don't see the reason to replace it with an alternative approach which as in this case is tool specific and we don't know for which other tools we would need something similar.
Better yet, CLICOLOR_FORCE and CLICOLOR, which CMake respects already and should pass on to makefile.
The reason I propose this is because the pty code is fragile. It uses limited resources (pty's are finite), it prevents the process from exiting (this patch makes ctrl-c much more responsive), it causes hangs due to async issues (see comment in code about having to put in an endline to work around bugs), and it makes flow control weird (when a process ends, it should release its resources as it unwinds the stack, but because pty's are conditionally used, there isn't a finally
block to put this cleanup in).
Better yet, CLICOLOR_FORCE and CLICOLOR, which CMake respects already and should pass on to makefile.
And this that doesn't address the various other tools invoked: e.g. bazel
, gradle
, etc.,
I think I'll need to ponder on this. While using a pty will make Bazel use color, it might also make it use curses, which seems like an accident. It also seems like a bad idea in Gradle to use rich output until this is resolved https://github.com/gradle/gradle/issues/6843
We also don’t want build tools to see that a terminal is interactive and wait for user input from a user who never sees the prompt (that may be why CMake ExternalProjects can take so long before failing... because they’re waiting on interactive input)
The check_call() API already exposes the use_pty
option. I would suggest to update code which uses that API (e.g. the CMake build task) to explicitly pass False
while also setting the necessary environment variables to ensure the current behavior is maintained. Once all existing code has been migrated to use use_pty=False
the option can be deprecated.
@rotu Since the discussion has been going a bit of topic (about use_pty
) what about the topic in the title of the ticket and my original question about how to proceed?
This seems similar to colcon/colcon-core#236. There would need to be custom logic to redirect the specific Ninja error message to
stderr
. Otherwisecolcon
can't do anything about it.So the question is are you interested in implementing custom logic for this? If yes, I would be happy to review a pull request.
I don't know the right approach to fixing this but I don't think it should be special treatment for Ninja. I think the best solution would be, when a subprocess fails, to print a postmortem with both the subprocess's stdout and stderr.
I'm not comfortable doing it before use_pty goes away because I think that code is too fragile.
I think the best solution would be, when a subprocess fails, to print a postmortem with both the subprocess's stdout and stderr.
Atm when a task fails the stderr
is being shown which seems to be the information you would want to see (rather than all of stdout
which can be very verbose). That approach works very well in many use cases - not for Ninja obviously.
I don't think the behavior should change to accommodate the fact that Ninja doesn't print error message to stderr
. Maybe reporting this upstream at the Ninja project might be an option.
Te question about the Ninja output (which doesn't print error message to stderr
, same as e.g. msbuild
) has been answered and needs to be either:
The pty discussion was off topic anyway. Please feel free to follow up on those independently.
So I am going to close the ticket since there is nothing to be done in this repo atm. @rotu Please post a link to an upstream ticket as soon as you have created one so future readers can check on the status.
Sadly it looks like the ninja project itself considers "feature" to be not-useful: https://github.com/ninja-build/ninja/issues/1537
Their reasoning is that the error message is not useful without the context of what command caused it and therefore if you care about why it failed, you also care about what command was called when it failed.
The suggested solution there is this:
NINJA_STATUS="[ninja][%f/%t]
filter() {
local l n
while read -r l; do
n=${l#\[ninja\]}
[ "x$l" != "x$n" ] && printf "%s\n" "$n" || printf "%s\n" "$l" >&2
done
}
NINJA_STATUS="[ninja][%f/%t] " ninja | filter
I don't know how difficult this would be to implement (here) but it is simple enough. I'll see if I can figure out how to add this logic.
This ninja PR seems like we could hook into it to get failure output: https://github.com/ninja-build/ninja/pull/1600
BTW, if anyone is looking for a way to use ninja builds and still see build errors, colcon can output stdout for packages during a build:
colcon build --event-handlers console_cohesion+
or for immediate build output (lines interleaved for parallel builds):
colcon build --event-handlers console_direct+
Taken from: https://colcon.readthedocs.io/en/released/user/how-to.html Also mentioned here: https://answers.ros.org/question/363112/how-to-see-compiler-invocation-in-colcon-build/
@alecGraves I took the liberty of referencing your workaround in How can I make colcon output a plain CMake project's output with the ninja generator?. If you have a stackoverflow account I'd be happy to tag you explicitly.
Hi there,
Would to colcon team accept a contribution that enables event-handlers=console_cohesion+
when CMAKE_GENERATOR=Ninja?
We switched to using Ninja because it was faster on most of our builds, but if you forget to add this and the build fails, you don't get the output.
I also don't think this issue is solved; the workaround is not a solution as this behavior is not expected.
When using Ninja as CMake's build system generator, colcon does not show error information:
These errors do show up in build logs:
build_2020-03-13_15-32-21.zip