Closed livarcocc closed 3 years ago
From @rmunn on Monday, December 23, 2019 8:18:21 PM
I'm the one who reported that issue with Expecto. I also have a vague memory of seeing this pop up with other dotnet foo
commands such as dotnet build
, but I don't have a solid example since I'm away from my desk for the holidays. Once I get a solid example I'll mention it here.
BTW, for anyone else who experiences this, you can restore your terminal to normal functionality after this issue triggers by typing stty echo
followed by Enter.
From @beevik on Tuesday, December 24, 2019 11:10:15 PM
An additional data point. I encountered the same bug running dotnet test
on MacOS.
I'm seeing this same thing happening with Xunit and Swensen.Unquote.
Approximately 50% of test runs will lock up the terminal (requiring terminal reset
command to unlock) on Gnome terminal on Ubuntu 18.04.
This had been happening for a while (dotnet core 2.1, 3.0, 3.1, not sure about before).
Seeing this on ubuntu 19.10 (linux 5.3.0-26-generic kernel), dotnet core 3.1, XUnit 2.4.1
Also experiencing this very regularly on Ubuntu 18.04.4 LTS, .NET Core 3.1.101
Anyone has any theory why this would be? I saw some escape codes being output into my input (ehm after test run I had some characters in the prompt that I did not type).
@beevik I actually can't see that from the prompt output, could you help me understand what I am seeing when the behavior is incorrect, and what I should be seeing when the behavior is correct, please?
I'm also experiencing this with .NET Core 3.1.102, Ubuntu 18.04.4 LTS.
This is an immense pain in the butt!
NET Core 3.1.101, MacOS 10.14.6, just using dotnet command on built-in terminal
@cniles I understand, can you help me replicate it, and possibly look at the issue description and tell me what I am looking for there?
I just ran into this. Definitely annoying! After reading through this and related issues, I tried to determine minimal reproducible steps.
dotnet test
from within a single test application's directory.dotnet test
from a directory with a solution file and multiple projects (mix of test and non-test projects, all added to the .sln file).dotnet test | sed -n 'l'
), I'm not seeing any immediately incriminating control characters.From the command line, I narrowed everything down to being able to run the following commands to reproduce the issue fairly reliably:
mkdir TestTerminalIssues
cd TestTerminalIssues
dotnet new xunit --name Project1
dotnet new xunit --name Project2
dotnet new classlib --name Project3
dotnet new sln
dotnet sln add Project1 Project2 Project3
dotnet build
dotnet test
Run dotnet test
until the terminal stops behaving as expected. It usually only takes me two or three attempts before the terminal "breaks."
Will try to dig in further, but I hope this helps, in the meantime.
I could not reliably repro this with your repro but it definitely pushed me into the right direction. Or hopefully I am seeing the same thing as you. I can only see it with xUnit. When I run two projects at the same time and it runs for more than 1s so that the progress kicks in, it will lock down silently when one is output errors and the other one is reporting also errors.
It will also lock down when the project reports success and it will print ^[[33;1R
into the console. Does that look familiar?
Stack Trace:
at Project2.UnitTest1.Test3() in /mnt/c/Projects/temp/TestTerminalIssues/Project2/UnitTest1.cs:line 31
Test Run Successful.
Total tests: 1
Passed: 1
Total time: 3.2791 Seconds
^[[33;1R
It does the same thing always.
using System;
using Xunit;
namespace Project1
{
public class UnitTest1
{
[Fact]
public void Test1()
{
System.Threading.Thread.Sleep(1500);
}
}
}
using System;
using Xunit;
namespace Project2
{
public class UnitTest1
{
[Fact]
public void Test1()
{
System.Threading.Thread.Sleep(100);
Assert.Equal(1, 2);
}
[Fact]
public void Test2()
{
System.Threading.Thread.Sleep(100);
Assert.Equal(1, 2);
}
// ... etc. until 50
}
}
Wow, that was interesting. Not the same symptoms I was seeing, though. I ran dotnet test
against the solution in your attached zip and definitely saw some strange results, but I was able to get the terminal to go back to normal after a lot of enter presses.
The symptoms I was seeing when I reproduced were:
dotnet test
, I'd press the Up arrow key to re-run dotnet test
. dotnet test
wouldn't appear in the terminal.reset
would not show reset
in the terminal, but pressing enter afterwards would execute the command).I've reproduced using the project I created with the commands from my previous note and attached all the console output for you to see the symptoms. I followed a pattern, so you can see what happened:
dotnet test
dotnet test
and press enter to run the command again.You'll see at line 162 in the output txt file that all the prompts end up on the same line and you won't see dotnet test
, even though the command is executed.
@Jodacola Sorry I kept the assertion commented out in the test project. In my project please go to the project with a single test and uncomment the assertion.
That will either get the terminal to lock up, and it can only be revived by ctl+c, or will put the cursor in the middle of the screen, or will output ^[[33;1R
and can be revived by pressing and holding enter, but then it will almost always hit the same ^[[33;1R
issue, so I have to start a new session to get the other problems.
From what you describe you are hitting one of these issues
Changing my project to output errors from both (copying the UnitTest1.cs from the one that has a lot of tests to both projects) I can hit the same issue as you did reliably.
Nice! Yes, when uncommenting the assertion, I was able to reproduce the same issue with your project.
When copying Project2/UnitTest1.cs to Project1, I was able to reproduce even more reliably (though not every single time).
I'm running into all three issues you described. With my own, I only ever ran into the one I exhibited with the output file I previously attached.
Been toying with this, for a while now, it has definitely something to do with console colors and moving the cursor. I have small toy app that I will push to travis to see what it will do.
Update: Here is the project. It runs multiple processes and on master it does not move the cursor, and does not use locks (they were probably not a problem anyway). On vstest-like it moves the cursor, and when there are multiple processes running at the same time I see weird output. It did not lock up yet.
Pushed original repro to the travis build and it got stuck. Will try to replace the progress with another that does not move the cursor, hopefully that will fix it. Then we can talk about how it looks.
https://github.com/microsoft/vstest/compare/master...nohwnd:non-cursor-progress here is my draft for non cursor progress, but somehow it lost the ability to use color in travis but works locally. Do you see me something stupid? I tried it without the locks, and with outputting newline after each line, but still nothing...
https://github.com/nohwnd/ProgressToy/compare/original-repro...original-repro-with-new-console and here is comparison between the failing branch, and the one with my new progress, there is single modified file where I added parameter to use custom console dll, the rest is what would normally ship in visual studio / dotnet sdk.
Thank you for the research, @nohwnd! I pulled down your repos and branches and have been playing around with things.
Of note: before playing with ProgressToy and its two branches, I had installed the latest version of the .NET Core SDK in my Ubuntu WSL environment. This caused dotnet test
to start using version 16.5.0 of the tool when running. More on that in a second.
original-repro
branch a bunch of times and couldn't reproduce the issue. Confusing, because I had no problem causing all kinds of issues before...dotnet test
is using Test Execution Command Line Tool version 16.5.0.dotnet test -p:VSTestConsolePath="/usr/share/dotnet/sdk/3.0.101/vstest.console.dll"
against ProjectToy@original-repro and was able to reproduce again (because that uses 16.3.0 in my environment).All this information is to say is that upon using 16.5.0 of the test execution tool, I cannot reproduce, even with the original repro projects.
Could you take a look and see if I'm missing anything here? This just might already be fixed in the latest version of the tool shipping with the latest SDK (I've confirmed 16.5.0 is packaged with 3.1.200 and 3.1.201)!
@Jodacola - Running the Microsoft (R) Test Execution Command Line Tool Version 16.5.0 under Ubuntu (NOT the Ubuntu WSL environment), I was able to reproduce this issue again, by cloning https://github.com/commandlineparser/commandline/ and running dotnet test
a few times. The third run triggered the issue: my typed input stopped being echoed to the terminal until I ran stty echo
to turn the echo flag back on.
Again, that was with version 16.5.0 of the text execution tool, running under .NET Core SDK 3.1.201.
Thanks, @rmunn - fired up an Ubuntu 18.04 VM in Hyper-V, cloned the commandline repository, ran dotnet test
with 16.5.0 of the tool and .NET Core SDK 3.1.201 and reproduced. Going to play around some more.
@nohwnd @Jodacola have you tried play-around with "script" command?
It is spawning new "subshell" session from which it is preserving all output bytes, which appears on its "insides", even ANSI escape codes for colours.
I can see, that the output has many:
ESC[?1hESC=
Which basing on http://ascii-table.com/ansi-escape-sequences-vt-100.php translates to:
But there is no "unset" escape sequence (?).
After some searching over the Internet I found that the opposite sequence might be:
ESC[?1lESC>
Under link above I am attaching captured outputs. One which makes no echo, and one which worked fine. I was doing it on Ubuntu 19.10.
Test Execution Command Line Tool Version 16.5.0
Another interesting thing is that all escape codes has same placement/pattern within both outputs, but apparently sometimes it does not break a console.
So it might not be related to escape sequences.
Dumps from stty -a
as in the main post.
Before - OK state.
After - NOT OK state.
So it might not be related to escape sequences.
I was looking for what kind of thing can adjust options just like "stty" command do, and found out, that a library which provides termios.h
header can do the thing.
After searching all possible dotnet
related source code I found it only in dotnet/runtime
:
https://github.com/dotnet/runtime/search?q=termios&unscoped_q=termios
And one file from dotnet/runtime
drew my attention:
https://github.com/dotnet/runtime/blob/05954773b6ac44d171dded50e79686619fa71522/src/libraries/Native/Unix/System.Native/pal_console.c#L181
Which touches exactly the same things that you can see in the screenshot above.
So this might be related to the most problematic/hard thing in a programming world called: race conditions OR simpler: threads :P
@nohwnd
Looking through the pal_console.c code, I noticed some interesting code interactions:
forChild
parameter is false. (If forChild
is true, the ECHO flag is untouched)..
, ..
, and ...
states).atexit()
handler to restore those terminal settings when the process exits.So what I think might be happening could be something like this interaction between all that code:
dotnet test
process starts up, and saves the current state of the terminal (including whether the echo
flag is on or off).dotnet test
starts upecho=false
in the terminal settingsecho=false
in the terminal settings that it stores in its global variableecho=false
, so any subsequent child processes that start after that point will pick up echo=false
as the settings they should restoreecho=false
, then it restores those terminal settings right before dotnet test
exits, and so the user is left with a terminal that doesn't echo.A complicated race condition, but all it requires is that some child processes complete before three seconds are up, which is not hard to have happen.
@nohwnd @Jodacola have you tried play-around with "script" command?
I did not, I am not that familiar with fixing issues on linux terminal 🙂 Thanks for the suggestion.
@rmunn wow! 👏👏👏
That seems very plausible and very close to what we've been observing. To aid you with your investitgation, what happens process-wise in the repro above is that:
dotnet test
kicks off build for the two test projects.
build builds the projects in parallel and then starts two vstest.console processes in parallel
( each vstest.console starts a test host, testhost only writes back through socket, but not to console, so forget about it )
in each vstest.console a timer starts that will write progress indicator to console after 1s
in each console we start getting back output from testhost and writing it to screen.
at this point we have 4 threads in 2 processes that are trying to write to the console
we lock on most of the output, and Console imho does as well, so we can only do 1 console action per process at 1 time.
I totally believe that what you described is at least part of the answer, but can you share your opinion on these?
v. could be caused by clearing the progress indicator from the screen after writing test results rather than 3rd dot? I am asking because the test summary almost always stopped the execution in the same way. There are different processes running, so I don't believe the timing was the same. I did no see anything special about the summary. I am guessing that there is a call to Clear on the ProgressIndicator when we are done running, that makes this easily reproducible? I saw it happen more consistently with xunit because it writes into console way more than mstest, and in smaller chunks.
in the repro I did not see it trigger when colors were disabled. In the repro where both sides fail it locks up silently, but with the long running test passing (in my repro the assertion is already commented out, so it behaves this way by default), outputs the same characters everytime. Can you explain that, and what the collision is?
@nohwnd To answer your two questions:
The ProgressIndicator.Clear() method accesses the Console.CursorTop property (via a helper class, but that helper class accesses Console.CursorTop), so a Clear() can also trigger this bug. Whichever of CursorLeft (accessed by printing the third dot) and CursorTop (accessed by Clear()) is first accessed will cause TryGetCursorPosition to be called, which will end up setting echo=false
on the terminal. (The second and subsequent calls to CursorTop and CursorLeft will cause TryGetCursorPosition to return a cached value so will not set echo=false a second time).
I don't know which of your repro branches is the one with colors disabled, so I'm having a bit of difficulty answering the second question. But scrolling back, I see that you've had cases where ^[[33;1R
gets printed to the console. That is definitely related to this bug, because that's the escape sequence that the terminal injects into the input to respond to a cursor-position request. (Specifically, the escape sequence is <ESC>
which renders as ^[
, followed by [{row};{col}R
. So when you see ^[[33;1R
it means the cursor was on column 1 of row 33.) The fact that you're seeing this escape sequence means that some subprocess had restored terminal settings that included echo=true
right before the terminal responded to the current cursor position request. That's sort of the opposite symptom of this bug (this one is about incorrectly restoring echo=false
) but is definitely the same race condition at work.
@rmunn thanks a lot for the explanation, I will compile a version that allows disabling the colors and progress by env variables and try it again. I am not 100% sure the colors were needed.
The console is prompted for ForegroundColor a lot, would that possibly cause an issue as well? Does it work similar to querying the position of the cursor - that is by turning echo off and writing a sequence to the terminal?
No, the foreground and background color-setting code does not touch any terminal settings like echo; they just write a single escape sequence and need no other special handling. And, at least in ConsolePal.Unix.cs where this bug is triggering, the code that reads the foreground color doesn't actually query the terminal at all. Instead, it simply remembers what the foreground color was last set to via the ForegroundColor property, and returns that value.
This means that if you deliberately set out to confuse the .NET Core Console class about what the current foreground color is, you can. You could set Console.ForegroundColor
to red, then manually print out the terminal escape sequence for "set foreground color to green". After that, if you were to read the value of the Console.ForegroundColor
property, it would be red, but any text you printed with Console.Write
would actually come out green. I don't consider this a bug, as there's no standard way to query the current terminal colors so it's not reasonable to expect .NET to query the real color if you've bypassed it like that. (Some terminals do implement a "what's the current color?" query sequence, but there's no standard terminfo capability for that query, and it's absolutely NOT in scope for .NET, IMHO, to basically reimplement part of terminfo to prevent people from shooting themselves in the foot.)
Gotcha. Very helpful. The terminal colors get confused often because we do what you described, we write colors from two different processes, and so process1 writes in red, while thinking it is writing in white.
Soooo.... is there any way you'd suggest this echo problem is fixed? I wrote a progress that works by postponing new line until next message comes (or until we are done testing) and from another thread it writes dots to the screen. This requires no cursor movement, does not shift the start of the output, and seems to work nicely overall. But somehow I lost the ability to output color on travisCI, but not locally in Ubuntu WSL. Any clues why that might be? https://github.com/microsoft/vstest/issues/2282#issuecomment-607381917
Only thing I can think of to suggest is that the first dotnet test
process, the one that kicks off two builds and two vstest.console processes, might need to explicitly save the terminal settings before kicking off its first process via Process.Start(). The only way I can see to do that in a safe way is to call Interop.Sys.InitializeTerminalAndSignalHandling explicitly somewhere in the dotnet test
code. This will, on Unix at least, ensure that InitializeTerminal() is called (which reads and stores the current terminal settings), and will set up an atexit
handler to call UninitializeTerminal() on program exit (which will restore the saved terminal settings). And since atexit
calls its registered functions in LIFO order, that means that as long as each process kicked off by dotnet test
was properly awaited and reaped before the master dotnet test
process ends, the final terminal settings restored will be the ones that the user originally had at the time he ran dotnet test
.
@rmunn
https://travis-ci.org/github/nohwnd/ProgressToy/builds/671662254 Pushed here, there are three runs in the same output:
- VSTEST_RUNNER_PROGRESS=0 VSTEST_RUNNER_COLOR=1 dotnet test
- VSTEST_RUNNER_PROGRESS=1 VSTEST_RUNNER_COLOR=0 dotnet test
- VSTEST_RUNNER_PROGRESS=1 VSTEST_RUNNER_COLOR=1 dotnet test
The first two that use only progress or only color pass, no problem. The third one that uses both progress and color gets stuck after the summary.
https://travis-ci.org/github/nohwnd/ProgressToy/builds/671665344
BTW, my availability for helping with this issue is about to go to zero for a while, as other tasks are going to take the bulk of my time in the next several weeks. I hope what I've done so far has been helpful.
😭😭😭
@rmunn Thanks for your help! If you have any idea why it would trigger only with colors let us know 🙂
What is going on with that Travis CI log? I'm having a hard time telling where the VSTEST_RUNNER_PROGRESS=1 VSTEST_RUNNER_COLOR=1 dotnet test
run starts and the other run ends, because it looks a bit interleaved. Does the same failure happen if you only run the VSTEST_RUNNER_PROGRESS=1 VSTEST_RUNNER_COLOR=1 dotnet test
version, and not the other two?
AFAICT, colors should have no effect on this bug, so I have no idea why the VSTEST_RUNNER_PROGRESS=1 VSTEST_RUNNER_COLOR=0 run succeeds (doesn't trigger the hang) but the VSTEST_RUNNER_PROGRESS=1 VSTEST_RUNNER_COLOR=1 run hangs. Where is the code that you wrote that handles the VSTEST_RUNNER_COLOR environment variable? I looked all over the branches of your ProgressToy repo and couldn't find it, so I can't tell whether you're accidentally changing something besides color when VSTEST_RUNNER_COLOR is equal to 1.
And my availability is likely to be almost zero in the next few weeks, but I might occasionally have a spare hour to throw at this.
And one other idea. If you're building custom versions of the console code for your repro, try adding code in Native/Unix/System.Native/pal_console.c that logs something to syslog (or just appends to a file with a hardcoded filename if you're not familiar with the Unix syslog API) every time the ConfigureTerminal function is called, and maybe also log something every time InitializeTerminalCore and UninitializeTerminal are called. If you can, log the PID of the process and the state the terminal was set (or restored) into, or at the very least the state of the ECHO flag being set or restored on the terminal. That might give you a visual picture of the race happening: by looking at the log files and tracing whether the ECHO flag is being handled in a way that makes sense.
@rmunn I would make a call to forward this issue to https://github.com/dotnet/runtime
I bet there are exactly the same looking issues all over the GitHub and a work needed is to just collect all links to them and then create an issue in https://github.com/dotnet/runtime repo indicating that this issue has pre-done investigation.
I think .NET Runtime should not in its roots even allow to break terminal outside. So it is not properly guarded OR has a "leakage" in its hermetization and "stack" of threads that are being spawned are not collapsing like it should in a stack manner, but randomly.
Or maybe it is principle of .NET, so a programmer should take care of threads spawning and ending/finishing.
@nohwnd most likely the issue has its roots in https://github.com/dotnet/runtime and an API that handles "per OS" System.Console
and in this case Unix looking like ones.
Unix implementation seems to be used under Linux and MacOS, so I bet that both platforms shares this issue.
This is not about Escape codes opening/enclosing coherency, where a solution is to introduce line buffering/queuing/central collecting/ and to keep opening/enclosing coherency per line.
Most likely the problem is in concurrent/parallel child threads/processes lifetime handling/managing. Maybe somewhere in a code we can have detached threads/processes, so the main process does not await for all jobs to be finished (seems that main process is capable to restore terminal to initial state on its exit by .NET runtime).
I'm also experiencing this with .NET Core 3.1.102, Ubuntu 18.04.4 LTS.
ditto that
Temporary work-around in bash
syntax:
BEFORE=$(stty -g); dotnet test --logger:"console;verbosity=detailed"; stty "${BEFORE}"
or in .bashrc
:
dotnet ()
{
local RET_VAL;
local BEFORE=$(stty -g);
$(which dotnet) "$@";
RET_VAL=$?;
sleep 0.33;
stty "${BEFORE}";
return $RET_VAL
}
But I will just try to notify persons who commited changes related to termios.h
in .NET Runtime to maybe clarify things.
@stephentoub @tmds could you take a look on this issue? In a case of tl;dr
I believe the most accurate are investigations since https://github.com/microsoft/vstest/issues/2282#issuecomment-609494059
To behave similar to Windows, .NET Core disables echoing when Console
is used for input in some way on Linux.
When a .NET Core app exits, it will restore the echoing state it captured at start.
Does dotnet test
wait for all it's children to terminate (especially the children which are given access to terminal by not redirecting streams)?
I'll try to reproduce this week using instructions from https://github.com/microsoft/vstest/issues/2282#issuecomment-605665646.
I think it does wait for them.
When invoking dotnet test
, this leads to MsBuild processes being started:
["/home/tmds/Downloads/dotnet-50-p2/dotnet", "/home/tmds/Downloads/dotnet-50-p2/sdk/5.0.100-preview.2.20176.6/MSBuild.dll", "/home/tmds/Downloads/dotnet-50-p2/sdk/5.0.100-preview.2.20176.6/MSBuild.dll", "/nologo", "/nodemode:1", "/nodeReuse:false"]
No one waits for this process to terminate. When such a process terminates after dotnet test
it restores terminal settings to what it observed at start, which may be echo off.
I think the resolution is to either wait for these processes to terminate, or to not give them access to terminal by redirecting their streams.
@tmds thanks for analyzing this.
@jakubch1 do you plan to solve the CC merge issue by using the msbuild to only build and figure out source paths, and then run multiple test hosts from single vsconsole? I think it would be the best way to go because that would help us here with the echoing, as well as with the parallel output, and with other issue where you can't run multiple different TFMs from single console.
@nohwnd I think we should go with minimal changes to current behaviour and running for individual project and running same project from solution should be almost identical. The issue is that for example coverlet is executing additional steps before tests run and after tests run (for individual project). I think more about approach that all individual project runs will report tests status and attachements to some orchestrator (through port similar like in VS).
any updates on this? It's just a huge pain to have a broken console everytime I run my test suite. I now instinctively write stty echo
after make test
. This is affecting multiple developers in my org.
I find myself wondering, is this problem certainly a VSCode one? Or might it be related to this issue and possibly related to some bug in the code itself. There is a comment on that other issue that suggests possibly an async await bug in the user project itself is the root cause of the problem.
I have to type stty echo frequently too--to the point that I have aliased the command. But if the problem is within the project itself... how does one figure that out? That's what leaves me stumped on both of these problems.
I find myself wondering, is this problem certainly a VSCode one? Or might it be related to this issue and possibly related to some bug in the code itself. There is a comment on that other issue that suggests possibly an async await bug in the user project itself is the root cause of the problem.
I have to type stty echo frequently too--to the point that I have aliased the command. But if the problem is within the project itself... how does one figure that out? That's what leaves me stumped on both of these problems.
definitely not. It happens running dotnet test from my terminal, vscode not involved at all.
definitely not. It happens running dotnet test from my terminal, vscode not involved at all.
But the dotnet test is running your test code. And when I run tests via VSCode... that's the same test code getting run.
I should have said "certainly a .NET problem"... not VSCode. Meaning... at heart is it not a problem with the user's code?
definitely not. It happens running dotnet test from my terminal, vscode not involved at all.
But the dotnet test is running your test code. And when I run tests via VSCode... that's the same test code getting run.
I should have said "certainly a .NET problem"... not VSCode. Meaning... at heart is it not a problem with the user's code?
Ah, saying vscode confused me.
It sounded like this is something related to not restoring the environment correctly when launching multiple processes per https://github.com/microsoft/vstest/issues/2282#issuecomment-614620392
I have multiple projects affected by this, while I'm certain I have bugs, I would be surprised if I managed to cause the same bug multiple times.
This is happening with code that does async operations, but I am sure I am awaiting them all properly in atleast one of the projects lol.
From @beevik on Saturday, December 21, 2019 7:20:52 PM
When running
dotnet test
on ubuntu, the terminal often stops echoing input after the command completes. It's not 100%. Sometimes echoing remains on. I'm using the xunit testing framework.Here are the results of typing
stty -a
after the terminal stops echoing:As you can see, echo is off (
-echo
).This does not appear to be a bug in xunit, as someone has reported a similar issue when using expecto instead.
Ubuntu 18.04.3 .NET Core 3.1 xunit 2.4.1
Copied from original issue: dotnet/sdk#4090