Open spetrosi opened 1 month ago
$ tmt --version
tmt version: 1.34.0
I found a workaround - when test plan begins, I type alt+Enter
, ctrl+Enter
, Enter
several times and then the plan runs smoothly, bringing me to the interactive menu as expected.
I am providing a simpler reproducer, a plan that provisions 3 nodes, and a simple test. It enough to create two files like below and run tmt try -p plans/reproducer
.
plans/reproducer.fmf
:
summary: Reproduce bug with extra indentation
provision:
- name: reproducer1
role: reproducer
- name: reproducer2
role: reproducer
- name: reproducer3
role: reproducer
discover:
- name: Run on 3 nodes
how: fmf
where: reproducer
filter: tag:reproducer
execute:
how: tmt
tests/reproducer/main.fmf
:
summary: Print date for test purposes
framework: shell
duration: 1m
tag: reproducer
test: date
I noticed what exactly causes this - using where
keyword to specify a role that has more than 2 nodes.
This issue blocks work on multi-node scenarios that use more than 2 nodes.
@spetrosi can you try with the following patch applied?
diff --git a/tmt/log.py b/tmt/log.py
index 66db71c5..18d848b9 100644
--- a/tmt/log.py
+++ b/tmt/log.py
@@ -287,7 +287,7 @@ class LogfileHandler(logging.FileHandler):
# with older Python versions. Since it's not critical to mark the handler as "str only", we can
# ignore the issue for now.
class ConsoleHandler(logging.StreamHandler): # type: ignore[type-arg]
- pass
+ terminator = '\r\n'
class _Formatter(logging.Formatter):
@happz it definitelly helps, but not everywhere. With the reproducer, it helps in the test phase, but the interactive menu is still broken. And after run finishes, I am not able to type in this terminal.
test rediscover tests and execute them again
login log into the guest for experimenting
verbose set the desired level of verbosity
debug choose a different debugging level
discover gather information about tests to be executed
prepare prepare the environment for testing
execute run tests using the specified executor
report provide test results overview and send reports
finish perform the finishing tasks, clean up guests
keep exit the session but keep the run for later use
quit clean up the run and quit the session
>
Is there anything special, or uncommon about your terminal settings? Because I'm unable to reproduce the problem, and since it looks like at some point, \n
stops being enough for your terminal & \r
would be needed, it's not like tmt suddenly stops emitting different line ends. So it's really puzzling and hard to debug, and I'm starting to think whether the problem is in tmt at all.
Never mind, I just reproduced the problem. FTR, it works perfectly fine with container
, but with virtual
, I observe the reported behavior.
Right, I am using virtual. Recently I tried using different terminals or a terminal in vscodium with same results. I too had a feeling that it may an issue on my side.
So, running stty sane
after the test command quits makes things right again. I suspected there must be something breaking the terminal, messing with its settings, like echo and so on, which would make sense, tmt try
is running those commands in interactive mode, test commands share stdout & stderr with their parent, tmt, so it's possible they might mess it up.
A couple of notes to think about:
ssh
& date
could mess up the EOL behavior - this one I'd like to learn because I simply lack the knowledge. And without understanding the cause, who knows what we might break by adding any kind of fix. Or what else would remain broken - why didn't we see this with multihost & interactive before? tmt try
shouldn't play a role, maybe we did not test this combination.stty sane
is a recommended way, I also found tput init
, didn't try that one though.man stty
says this about stty sane
: sane same as cread -ignbrk brkint -inlcr -igncr icrnl icanon iexten echo echoe echok -echonl -noflsh -ixoff -iutf8 -iuclc -ixany imaxbel -xcase -olcuc -ocrnl opost -ofill onlcr -onocr -onlret nl0 cr0 tab0 bs0 vt0 ff0 isig -tostop -ofdel -echoprt echoctl echoke -extproc -flusho, all special characters to their default values
- it's quite a lot of controls, maybe it's simply not worth the effort.stty sane
? Probably not. But what if one of the tests does some ASCII art or runs console app or uses colors, and the other test finishes sooner & runs its stty sane
and messes up the craftfully prepared output of the first test? I bet that one can happen quite easily :)stty sane
- and we're getting back to the first point, if we find out what happened, maybe we could prevent it instead of cleaning up the mess.date
, let's say both 3 test invocations do run some complex ncurses app, with dialogs and input boxes - all 3 would try to emit their output to the same terminal, and that will not be a lovely sight :) One changes color to yellow, the other sets font to underline, the third would be almost finished & it would quit the app and reset the terminal to its original state... This won't have any solution other than telling users "you can have ANSI control characters, multihost and tmt try
with interactive mode enabled - just not all at the same time, pick 2"I would even guess this might be simply the cause here, 3 SSH processes fighting over 1 terminal. Sooner or later, one of them does A and the other one undoes it, and we end up with an inconsistent setup. We can add stty sane
, but I'd say we need to think about whether it's possible to consider it a fix, or whether it's just one part of the solution: stty sane
+ recommendation on what happens with multihost and ANSI control codes + option to disable interactive + ... And even the simplest beakerlib test uses colors and runs over SSH...
With the interactive mode disabled in tmt try
, it just works, no stty sane
needed - they no longer fight over shared terminal... And even basic Beakerlib tests print out colors...
3 trivial Beakerlib tests + stty sane
- yes, eventually, when tests are done, you get tmt try
prompt back, and you can control tmt and input values. But the output is useless.
execute
queued execute task #1: Run on 3 nodes on reproducer1 (reproducer), reproducer2 (reproducer) and reproducer3 (reproducer)
execute task #1: Run on 3 nodes on reproducer1 (reproducer), reproducer2 (reproducer) and reproducer3 (reproducer)
[reproducer3 (reproducer)] started
[reproducer3 (reproducer)] how: tmt
[reproducer3 (reproducer)] order: 50
[reproducer2 (reproducer)] started
[reproducer3 (reproducer)] exit-first: false
[reproducer2 (reproducer)] how: tmt
[reproducer1 (reproducer)] started
[reproducer2 (reproducer)] order: 50
[reproducer1 (reproducer)] how: tmt
[reproducer2 (reproducer)] exit-first: false
[reproducer1 (reproducer)] order: 50
[reproducer1 (reproducer)] exit-first: false
[reproducer2 (reproducer)] test: Print date for test purposes
[reproducer1 (reproducer)] test: Print date for test purposes
[reproducer3 (reproducer)] test: Print date for test purposes
[reproducer2 (reproducer)] warn: Ignoring requested duration, not supported in interactive mode.
[reproducer2 (reproducer)] cmd: ./test.sh
[reproducer1 (reproducer)] warn: Ignoring requested duration, not supported in interactive mode.
[reproducer1 (reproducer)] cmd: ./test.sh
[reproducer3 (reproducer)] warn: Ignoring requested duration, not supported in interactive mode.
[reproducer3 (reproducer)] cmd: ./test.sh
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: Setup
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: Duration: 0s
:: Assertions: 0 good, 0 bad
:: RESULT: PASS (Setup)
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: Setup
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: Setup
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: Duration: 0s
:: Assertions: 0 good, 0 bad
:: RESULT: PASS (Setup)
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: Duration: 0s
:: Assertions: 0 good, 0 bad
:: RESULT: PASS (Setup)
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: TEST PROTOCOL
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Package : unknown
beakerlib RPM : beakerlib-1.29.3-5.fc40.noarch
Test name :
Test started : 2024-08-16 20:14:12 UTC
Test finished : 2024-08-16 20:14:13 UTC (still running)
Test duration : 1 seconds
Distro : Fedora release 40 (Forty)
Hostname : reproducer2
Architecture : unknown
CPUs : 2 x Intel(R) Core(TM) i7-10610U CPU @ 1.80GHz
RAM size : 1914 MB
HDD size : 117.80 GB
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: Setup
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: Duration: 0s
:: Assertions: 0 good, 0 bad
:: RESULT: PASS (Setup)
Shared connection to 127.0.0.1 closed.
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: TEST PROTOCOL
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Package : unknown
beakerlib RPM : beakerlib-1.29.3-5.fc40.noarch
Test name :
Test started : 2024-08-16 20:14:13 UTC
Test finished : 2024-08-16 20:14:13 UTC (still running)
Test duration : 0 seconds
Distro : Fedora release 40 (Forty)
Hostname : reproducer1
Architecture : unknown
CPUs : 2 x Intel(R) Core(TM) i7-10610U CPU @ 1.80GHz
RAM size : 1914 MB
HDD size : 117.80 GB
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: Setup
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: Duration: 0s
:: Assertions: 0 good, 0 bad
:: RESULT: PASS (Setup)
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: TEST PROTOCOL
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Package : unknown
beakerlib RPM : beakerlib-1.29.3-5.fc40.noarch
Test name :
Test started : 2024-08-16 20:14:13 UTC
Test finished : 2024-08-16 20:14:13 UTC (still running)
Test duration : 0 seconds
Distro : Fedora release 40 (Forty)
Hostname : reproducer3
Architecture : unknown
CPUs : 2 x Intel(R) Core(TM) i7-10610U CPU @ 1.80GHz
RAM size : 1914 MB
HDD size : 117.80 GB
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: Setup
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: Duration: 0s
:: Assertions: 0 good, 0 bad
:: RESULT: PASS (Setup)
Shared connection to 127.0.0.1 closed.
Shared connection to 127.0.0.1 closed.
[reproducer2 (reproducer)] 00:00:00 errr /tests/reproducer (on reproducer2 (reproducer)) (beakerlib: State 'incomplete') [1/1]
[reproducer1 (reproducer)] 00:00:00 errr /tests/reproducer (on reproducer1 (reproducer)) (beakerlib: State 'incomplete') [1/1]
[reproducer3 (reproducer)] 00:00:00 errr /tests/reproducer (on reproducer3 (reproducer)) (beakerlib: State 'incomplete') [1/1]
[reproducer2 (reproducer)] finished
[reproducer1 (reproducer)] finished
[reproducer3 (reproducer)] finished
summary: 3 tests executed
Many thanks for finding the cause of this. stty sane
- very smart, I didn't know that.
We are running tests in testing farm via https://github.com/marketplace/actions/schedule-tests-on-testing-farm, e.g. in https://github.com/linux-system-roles/ha_cluster/actions/runs/10450150449/job/28933928537.
The log in the testing farm is printed correctly. Maybe in a multihost scenario tmt try
might avoid interactively printing test outputs on the run, but instead save it to a log and then print the whole log, which is formatted correctly, once the test is ready. And then exit to the interactive menu. Just thinking out loud.
Many thanks for finding the cause of this.
stty sane
- very smart, I didn't know that.We are running tests in testing farm via https://github.com/marketplace/actions/schedule-tests-on-testing-farm, e.g. in https://github.com/linux-system-roles/ha_cluster/actions/runs/10450150449/job/28933928537. The log in the testing farm is printed correctly.
That is expected, because TF does not run them in interactive mode, and they don't share stdout/stderr channels.
Maybe in a multihost scenario
tmt try
might avoid interactively printing test outputs on the run, but instead save it to a log and then print the whole log, which is formatted correctly, once the test is ready. And then exit to the interactive menu. Just thinking out loud.
Yes, something along those lines. I would dare to suggest disabling interactive mode in multihost scenarios by default - on top of output formatting, how about shared stdin? All three branches of testing share it, if the user wants to interact with the test, which one of them? The one on guest 1, or the one on guest 2? Maybe the one on guest 3 which is still running? This quickly turns into getting ugly output & not being able to use input reliably either.
Maybe we need some kind of window, like ncurses windows: split the screen, have a window for the menu and a window for each guest, separate their output, and let users switch between windows when talking to their tests. In any case, shared stdout and/or stdin will cause trouble.
Is it possible to use tmt run plans
to do the following?
tmt run plans --filter 'tag:mytag'
, where logs are stored in this case?
After running a plan with multi-host provisioning with 3+ hosts, terminal "breaks", it prints many spaces in the beginning of the lines, and then doesn't work. With two nodes, the output is fine. With >3 nodes, it starts shifting.
Plan from plans/test_plan.fmf:
I am running it with
tmt try -p plans/test_plan centos-stream-9
What my terminal outputs: https://pastebin.com/Ae18humU
Terminal breaks after
Running 'rlImport library'
line