SDL-Hercules-390 / hyperion

The SDL Hercules 4.x Hyperion version of the System/370, ESA/390, and z/Architecture Emulator
Other
237 stars 89 forks source link

message sequence repeated, two times before the quit command, one time after #645

Closed atncsj6h closed 3 months ago

atncsj6h commented 5 months ago

Unfortunately...

System environment:

Hercules basic info:

HHC01413I Hercules version 4.8.0.11129-SDL-DEV-g5517d322
HHC01414I (C) Copyright 1999-2024 by Roger Bowler, Jan Jaeger, and others
HHC01417I Built using Hercules-Helper (version: v0.9.8-5-gc2450a8-dirty)
HHC01415I Build date: Mar 23 2024 at 17:20:43
HHC01417I Built with: Apple Clang 15.0.0 (clang-1500.3.9.4)
HHC01417I Build type: Mac OS X aarch64 host architecture build

Find attached :

The hercules logs looks good. Cannot say the same for the terminal session log.

thanks in advance enrico

wrljet commented 5 months ago

Enrico,

I just tried it out on Mac M1, macOS Sonoma 14.3.1.

I don't see any duplication of lines in the terminal.

I have noticed before, though, that scrolling back in the macOS Terminal with its scrollbar, vs. using Hercules PageUp/PageDown feature causes the terminal to get confused and that can make some output look duplicated.

How did you capture the hercterm*.txt files?

Bill

atncsj6h commented 5 months ago

Hi Bill!

just tested again ... command K to clear the display entered the hercules start command /Users/enrico/hercules.SDL/herc4x/bin/hercules -f none -o herclog.txt

left the keyboard alone entered the quit command

to be sure I changed to the smallest font size visible so that the terminal data was all there without scrolling and the replicates were plainly visible

the hercterm.txt files were a plain cut and paste from the command window

enrico

wrljet commented 5 months ago

Enrico,

I updated macOS to 14.4, and tried the small font. Still don't see that behavior here.

Can you send me your compiled binaries to try out?

Bill

atncsj6h commented 4 months ago

removed everything and restarted from scratch ... find attached

the console log noconf_con.txt

the hercules log noconf_log.txt

the binaries herc4x.tgz

the config.h from the build directory config_h.txt

wrljet commented 4 months ago

I ran your binaries and I still don't see the duplicated lines.

But, I do now notice a difference that before I overlooked.

You have ooRexx installed and Hercules is picking up the load module for it. And I don't have that on my Mac system.

Can you please tell me how you installed ooRexx, and I will do the same here.

Bill

atncsj6h commented 4 months ago

Hi Bill.. no reason to install oorexx

I usually build hercules with oorexx enabled after that at run time all depends if the ooRexx libraries are available in both cases I get the ooRexx related messages ( the good and the bad ) repeated

but ... if I build hercules without ooRexx support then the oorexx repeated messages disappear
but there are still some leftovers

I will try to build hercules with regina to see what happens and to see if it is only ooRrexx related

anyway there are still some duplicated/triplicated messages I guess that we' ll have to live with the bad design

cheers enrico

PS the ooRexx related messages are evident because there are lots of them

wrljet commented 4 months ago

I wonder if it's related to stdio buffering?

Fish-Git commented 4 months ago

I guess that we' ll have to live with the bad design

What bad design?

JamesWekel commented 4 months ago

The problem of multiple duplicate "screens" in the terminal session log also occurs on Linux systems. The problem is partially related to the change to remove duplicate lines in the log at shutdown.

Fish, I'll work on a proposed solution for your consideration and, Bill, I'm need help on testing on various OS flavors.

Because the terminal session log will never contain all the lines in the log, should the terminal session log be formatted as follows:

  1. startup messages
  2. a marker to see the log
  3. shutdown messages

similar to terminal-log.txt.

Jim

Fish-Git commented 4 months ago

The problem of multiple duplicate "screens" in the terminal session log also occurs on Linux systems.

I believe you mean "lines", not "screens", yes? That is to say, the reported problem is that there are duplicate lines (duplicate messages) appearing on the screen (terminal session), yes? After all, that is the problem Enrico originally reported that we're trying to figure out, yes?

P.S. On what Linux systems other than Apple does the problem occur? It doesn't occur on any of mine or Bill's. Does it occur on yours, James? If so, which ones?

The problem is partially related to the change to remove duplicate lines in the log at shutdown.

Then it needs to be fixed, yes? FWIW, I myself have never seen any duplicate lines/messages appear on my terminal (screen) on any of my Linux systems (or Windows systems either for that matter).

And from the comments I'm reading, it's not occurring on any of Bill's Linux systems either, nor on Bill's Apple system.

The problem appears to only occur on Enrico's specific Apple system. Yes?

There is obviously (from my point of view, based on my reading of this issue) something unusual about Enrico's system (either the way he has it configured or the software he has installed on it, etc) that is affecting his terminal sessions, and we just need to (or HE needs to!) figure out what that unusual thing is.

If it's a PEBKAC, then fine. Once identified, we could then document it for the benefit of our other Apple users. If it's not however, if his configuration and/or software is simply unusual/uncommon but legitimate, then we need to figure out what that is and adjust Hercules accordingly to allow for that situation so that the problem doesn't occur. I.e. We need to identify and fix the problem.

It sounds like a difficult task (challenge), yes, but IMHO that is ultimately what we need to do. And based on the amount of trouble we've had in this part of Hercules over the years, I would say my opinion is justified.

Perhaps it's time to completely redesign our message handling. I have never cared for it since the day Jan originally wrote it. It's stupid IMHO to use pipes just to pass data from thread to thread.

Because the terminal session log will never contain all the lines in the log, ...

Why not? What you see on the screen should match reality. (just as the log should too) I'm not sure I agree with your statement. Why will the terminal "never" match the log? Please explain!

... should the terminal session log be formatted as follows:

  1. startup messages
  2. a marker to see the log
  3. shutdown messages

similar to terminal-log.txt.

No! Of course not! I cannot agree with that! That is just a piss-poor workaround! (purposely hiding of the underlying problem!)

ALL messages issued by Hercules should appear on the screen. And should appear only once. What a silly question.

atncsj6h commented 4 months ago

I spent some time trying to find hints and ... found out that

if the terminal window is large enough to contain ALL the messages before displaying the command line at the bottom of the window there are no duplicates lines

I run a simple test with the menlo font size 8 ( terminal windows holds 110 lines and is pretty unreadable ) and NO duplicates lines appeared

but running with a font size of 16 to have a readable text ( terminal windows holds 45 lines ) the startup messages do not fit into the terminal window the terminal window flickers a few times ( to resync ??? ) before displaying the command line at the bottom of the window and the duplicate lines are there

I installed iTerm2 ( an alternative to apple terminal ) the behavior is the same

PS. the above findings might give an hint to determine if the behavior is fixable or not

for me since i found the reason for the odd behavior the issue can be closed

Fish-Git commented 4 months ago

I presume whatever terminal/screen interface you're using supports ANSI/ISO escape codes, etc? (i.e. ANSI/ISO-6429 (ECMA-48) SGR terminal escape sequences):

From hconsole.h:

#define  ANSI_POSITION_CURSOR  "\x1B[%d;%dH"
#define  ANSI_ERASE_EOL        "\x1B[K"
#define  ANSI_ERASE_SCREEN     "\x1B[2J"

etc...

I'm thinking maybe one or more of the functions defined in hconsole.c -- (set_or_reset_console_mode, set_screen_pos, erase_to_eol, clear_screen, ISO_COLOR, set_screen_color, etc...) -- is failing on your system for some unknown reason? Can you maybe debug your issue from that point of view?

Are you perhaps using a non-standard third party or older/newer terminal (screen) program on your system?

Bottom line: SOMETHING is different on your system to cause it to fail the way it does. As Bill reports, it seems to work fine on his Apple system.

We could, if needed, insert temporary debug code into panel.c to save an exact copy of the binary data written to the console (terminal/screen) in a separate debug file, and then write a simple test program to read that file and write it as-is to the terminal/screen to see if maybe Bill can reproduce/recreate your problem? We could then compare what gets written on your system with what gets written on Bill's system and maybe that will identify the problem or provide a clue?

These are just some thoughts on how we could maybe approach this problem?

wrljet commented 4 months ago

I know it's very unlikely to be the cause, but Enrico's Mac Book Pro, chip: APPLE M3 Pro is two generation newer Apple silicon than my Mac Mini M1.

Bill

wrljet commented 4 months ago

Fish,

Enrico reported above, he tried iTerm2 ( an alternative to the default Apple terminal program ), and got the same behavior.

Somehow it looks to me like an entire stdio buffer full is being repeated, although I didn't count the chars.

Bill

JamesWekel commented 4 months ago

Fish:

First my environment:

  1. hercules is tested on: uname -a Linux NucDev-5 5.15.80-x86 22.11.1 SMP Wed Nov 30 11:34:07 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

    cat /etc/os-release PRETTY_NAME="Armbian 24.2.1 jammy" NAME="Ubuntu" VERSION_ID="22.04" VERSION="22.04.4 LTS (Jammy Jellyfish)"

  2. hercules is accessed from Windows 11 terminal using SSH.

Second:

You asked:

I believe you mean "lines", not "screens", yes? That is to say, the reported problem is that there are duplicate lines (duplicate messages) appearing on the screen (terminal session), yes? After all, that is the problem Enrico originally reported that we're trying to figure out, yes?

In my case, I mean screens. The terminology get confusing because of the 'copy/paste' process of the terminal/console window. Some systems may insert \n (Enrico macOs?) but the Windows Terminal copy just copies the actual screen.

Here are two test cases: Test 1: hecules started as "hercules -f none -o herclog.txt" with two commands 'help' and quit'. I've used 'help' to add some lines to the panel that are not startup lines. The terminal copy/paste has long, long, long, ... line that is the screen.

test1-terminal.txt

test1-terminal-edited.txt

test1-herclog.txt

Test 2: hecules started as "hercules -f none -o herclog.txt" with three commands 'hao tgt xxx', 'help' and quit' and the associated files.

test2-terminal.txt test2-terminal-edited.txt test2-herclog.txt

I've included this case to highlight the panel command and status line of the first screen

herc =====> HHC00101I Thread id 00007f0d40f73640, prio -1, name 'hao_thread' ended HHC01423I Calling termination routines0000 24M......Z instcnt 0; mips 0.000; I/O 0

showing a timing problem between the shutdown flag being set and panel-cleanup being executed.

My analysis is that clear_screen( stderr ); ( an ANSI_ERASE_SCREEN "\x1B[2J" ) has a side effect of flushing the current screen buffer. How to test this? I added the following routine to panel.c:

    /* write spaces over the whole pane and set position to top left */
    static void blank_panel()
    {
        char blanks[MSG_SIZE];
        int i = 0;

        memset(blanks, ' ' , MSG_SIZE);
        for (i=1; i <= cons_rows; i++)
        {
            set_pos(i,1);
            write_text(blanks, MSG_SIZE);
        }
        set_pos(1,1);
    }

This routine should do the same thing as ERASE_SCREEN, a blank screen with the position at the top/left. With clear_screen(stderr); replaced with blank_panel(); and repeating the second test results in:

test3-terminal.txt test3-herclog.txt

Only one screen (built by panel_cleanup) is shown, but the following lines from the log are missing:

HHC01420I Begin Hercules shutdown HHC00101I Thread id 00007f06d49fe640, prio -1, name 'hao_thread' ended HHC01423I Calling termination routines

I'll let you determine what fix to make.

You asked:

Because the terminal session log will never contain all the lines in the log, ...

Why not? What you see on the screen should match reality. (just as the log should too) I'm not sure I agree with your statement. Why will the terminal "never" match the log? Please explain!

When panel is running, the screen is non-buffered and totally managed by the the panel thread. The panel thread use log_read() to get messages but the thread completely manages where and what is displayed i.e. nothing is written to the file stream except for startup lines, the last screen and shutdown lines. The test terminal copy/paste examples, only show the last screen of the 'help' command. As well, some shutdown messages are not shown due to timing uses between setting the shutdown flag and panel-cleanup() execution.

I'm too new to hercules to comment on changing message handling...

Hope this helps, Jim

JamesWekel commented 4 months ago

Fish:

Changed my mind, I'll submit a pull request for testing.

The request handles two items:

  1. replaces clean_screen() in panel_cleanup with blank_panel() as referenced above.
  2. adds some synchronization, controlled by hsmisc.c, between logger and panel. This synchronization improves terminal console messages at shutdown by capturing some additional messages. A new message 'HHC01421I Shutdown: Panel cleanup complete' has been added.

Jim

atncsj6h commented 4 months ago

Hello Jim could You please post the diffs against the current develop branch so that I can test on APPLE

thank You in advance enrico

Fish-Git commented 4 months ago

Hello Jim could You please post the diffs against the current develop branch so that I can test on APPLE

This is Jim's proposed change (Pull Request):

but please be aware that I have not had time to review it yet (I hope to do that today), so it might change. What you would be testing might not be what eventually gets committed.

wrljet commented 4 months ago

This will need a very thorough test everywhere.

atncsj6h commented 4 months ago

Hi all I applied the patch and things have changed

the terminal output is still partially clobbered, but the part after the quit command is clean

after the patch

HHC01603I quit                                                                                                                                                         
HHC01420I Begin Hercules shutdown                                                                                                                                      
HHC01421I Shutdown: Panel cleanup complete                                                                                                                             
HHC01423I Calling termination routines                                                                                                                                 
HHC00101I Thread id 000000016b41b000, prio 16, name 'Processor CP00' ended
HHC01427I Main storage released
HHC01427I Expanded storage released
HHC01422I Configuration released
HHC01424I All termination routines complete
HHC01425I Hercules shutdown complete
HHC01412I Hercules terminated
HHC00101I Thread id 000000016b633000, prio 16, name 'rubato_thread' ended
HHC00101I Thread id 000000016b527000, prio 16, name 'timer_thread' ended
HHC00101I Thread id 00000001f019fac0, prio 16, name 'panel_display' ended

before the patch

quit
herc =====> HHC01423I Calling termination routines                                                                                                                     
CP00 PSW=0000000000000000 0000000000000000 24M......Z                                                                                 instcnt 0; mips 0.000; I/O      0
HHC01417I Without National Language Support                                                                                                                            
HHC01417I With    CCKD64 Support                                                                                                                                       
HHC01417I With    Transactional-Execution Facility support                                                                                                             
HHC01417I With    "Optimized" instructions                                                                                                                             
HHC01417I With    OPTION_USE_SKAIP_AS_LOCK                                                                                                                             
HHC01417I With    OPTION_SIE2BK_FLD_COPY                                                                                                                               
HHC01417I With    OPTION_IODELAY_KLUDGE                                                                                                                                
HHC01417I With    OPTION_MVS_TELNET_WORKAROUND                                                                                                                         
HHC01417I With    OPTION_SIE_PURGE_DAT_ALWAYS                                                                                                                          
HHC01417I With    OPTION_NOASYNC_SF_CMDS                                                                                                                               
HHC01417I Machine dependent assists: cmpxchg1 cmpxchg4 cmpxchg8 cmpxchg16 hatomics=C11                                                                                 
HHC01417I Running on: enrico-mbp.local (Darwin-23.4.0 arm64) LP=12, Cores=12, CPUs=1                                                                                   
HHC01417I Built with crypto external package version 1.0.0.52-ga5096e5                                                                                                 
HHC01417I Built with decNumber external package version 3.68.0.102-g3aa2f45                                                                                            
HHC01417I Built with SoftFloat external package version 3.5.0.105-g4b0c326                                                                                             
HHC01417I Built with telnet external package version 1.0.0.63-g729f0b6                                                                                                 
HHC00100I Thread id 00000001f019fac0, prio 16, name 'impl_thread' started                                                                                              
HHC00100I Thread id 000000016f1f3000, prio 4, name 'logger_thread' started                                                                                             
HHC00017W Crash dumps NOT enabled                                                                                                                                      
HHC00007I Previous message from function 'impl' at impl.c(1258)                                                                                                        
HHC00018W Hercules is NOT running in elevated mode                                                                                                                     
HHC00007I Previous message from function 'impl' at impl.c(1267)                                                                                                        
HHC02323W This build of Hercules has only partial TCP keepalive support                                                                                                
HHC00007I Previous message from function 'impl' at impl.c(1313)                                                                                                        
HHC02320W Not all TCP keepalive settings honored                                                                                                                       
HHC00007I Previous message from function 'impl' at impl.c(1357)                                                                                                        
HHC00150I Crypto module loaded (C) Copyright 2003-2016 by Bernard van der Helm                                                                                         
HHC00151I Activated facility: Message Security Assist                                                                                                                  
HHC00151I Activated facility: Message Security Assist Extension 1, 2, 3 and 4                                                                                          
HHC17528I REXX(OORexx) VERSION: REXX-ooRexx_5.1.0(MT)_64-bit 6.05 25 Jan 2024                                                                                          
HHC17529I REXX(OORexx) SOURCE:  DARWIN                                                                                                                                 
HHC17525I REXX(OORexx) Rexx has been started/enabled                                                                                                                   
HHC00112W Thread CPU Time is NOT available.                                                                                                                            
HHC00007I Previous message from function 'configure_cpu' at config.c(1087)                                                                                             
HHC00100I Thread id 000000016f40b000, prio 16, name 'Processor CP00' started                                                                                           
HHC00100I Thread id 000000016f517000, prio 16, name 'timer_thread' started                                                                                             
HHC00811I Processor CP00: architecture mode z/Arch                                                                                                                     
HHC02385W CPUMODEL 0586 does not technically support TXF                                                                                                               
HHC00007I Previous message from function 'txf_model_warning' at transact.c(2562)                                                                                       
HHC17736W TXF: TIMERINT 50 is too small; using default of 400 instead                                                                                                  
HHC00007I Previous message from function 'txf_set_timerint' at transact.c(2587)                                                                                        
HHC00100I Thread id 000000016f623000, prio 16, name 'rubato_thread' started                                                                                            
HHC01603I quit                                                                                                                                                         
HHC00101I Thread id 000000016f40b000, prio 16, name 'Processor CP00' ended
HHC01427I Main storage released
HHC01427I Expanded storage released
HHC01422I Configuration released
HHC01424I All termination routines complete
HHC01425I Hercules shutdown complete
HHC01412I Hercules terminated
HHC00101I Thread id 000000016f623000, prio 16, name 'rubato_thread' ended
HHC00101I Thread id 000000016f517000, prio 16, name 'timer_thread' ended
HHC00101I Thread id 00000001f019fac0, prio 16, name 'panel_display' ended

enrico

P.S. the display is different because - to shorten the dispays I commented out the call to DisplayOptions() in hRexx.c I do not think that changes the overall flow of things

Fish-Git commented 4 months ago

See my comment in Jim's Pull Request.

JamesWekel commented 4 months ago

Fish, thanks for posting the diff for Enrico.

Bill, I concur that this patch needs testing everywhere as it is trying to coordinate/synchronize shutdown across multiple threads. It is very easy to get it wrong! I got it wrong a few times before this version of the patch!

Enrico, At shutdown (quit), the last panel of messages are printed to the console prior to the remaining shutdown messages. If the only command is quit, the panel just contains startup messages which will then be reprinted. For this scenario, the patch only removes the 2nd set of screen line(s) when you were previously getting 3 sets of screens.

Jim

Fish-Git commented 4 months ago

See my new comment in Jim's Pull Request:

Fish-Git commented 3 months ago

See my latest comment in Jim's Pull Request too:

Fish-Git commented 3 months ago

Closed by commit e59857f524d2e36cbedebc983b6a35f57f78b02e