SDL-Hercules-390 / hyperion

The SDL Hercules 4.x Hyperion version of the System/370, ESA/390, and z/Architecture Emulator
Other
245 stars 92 forks source link

Intermittent wait after configuration file processing on Linux. #120

Closed Peter-J-Jansen closed 6 years ago

Peter-J-Jansen commented 6 years ago

It has been observed that sometimes (i.e. not always), during SDL-Hyperion startup on Linux, towards the end of the configuration file processing, but before the panel displays, a wait can occur, lasting a few minutes (or longer or shorter). This wait may resolve itself. The current workaround is to try waiting it out, or do a quit, and retry to start again.(*)


(*) Or... clicking the mouse outside the terminal window where Hercules is running, then clicking inside the window, then outside, then inside, etc. After a few mouse clicks, the problem seems to resolve itself within a few seconds. This is on my CentOS 6.10 VMware virtual machine. -- Fish.

Peter-J-Jansen commented 6 years ago

Trying to help reasearching this mysterious wait, I've been experimenting with the Ubuntu "top" command.

Whilst the wait is going on the top command shows this, with field WCHAN added:

top - 15:39:21 up 23:24,  1 user,  load average: 0.19, 0.51, 0.42
Threads:  87 total,   0 running,  87 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.4 us,  0.1 sy,  0.1 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 16106100 total, 12838180 free,  1103896 used,  2164024 buff/cache
KiB Swap:  8189948 total,  8189948 free,        0 used. 14597816 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                WCHAN                          
19544 root      -9   0 12.776g  21416   5780 S  4.7  0.1   0:05.13 timer_thread           hrtimer_nanosleep              
19540 root      -7   0 12.776g  21416   5780 S  0.0  0.1   0:00.62 impl_thread            random_read.part.34            
19541 root      -6   0 12.776g  21416   5780 S  0.0  0.1   0:00.00 logger_thread          pipe_wait                      
19542 root      -3   0 12.776g  21416   5780 S  0.0  0.1   0:00.00 watchdog_thread        hrtimer_nanosleep              
19543 root      -4   0 12.776g  21416   5780 S  0.0  0.1   0:00.00 Processor CP00         futex_wait_queue_me            
19545 root      -4   0 12.776g  21416   5780 S  0.0  0.1   0:00.00 Processor CP01         futex_wait_queue_me            
19546 root      -4   0 12.776g  21416   5780 S  0.0  0.1   0:00.00 Processor CP02         futex_wait_queue_me            
19547 root      -4   0 12.776g  21416   5780 S  0.0  0.1   0:00.00 Processor CP03         futex_wait_queue_me            
19548 root      -4   0 12.776g  21416   5780 S  0.0  0.1   0:00.00 Processor CP04         futex_wait_queue_me            
19549 root      -4   0 12.776g  21416   5780 S  0.0  0.1   0:00.00 Processor CP05         futex_wait_queue_me            
19550 root      -7   0 12.776g  21416   5780 S  0.0  0.1   0:00.00 shared_server          poll_schedule_timeout          
19551 root      -6   0 12.776g  21416   5780 S  0.0  0.1   0:00.02 console_connect        poll_schedule_timeout          
19554 root      -7   0 12.776g  21416   5780 S  0.0  0.1   0:00.00                        tun_do_read.part.50            
19591 root      -7   0 12.776g  21416   5780 S  0.0  0.1   0:00.00 CTCE 0E40 Liste        inet_csk_accept                
19592 root      -7   0 12.776g  21416   5780 S  0.0  0.1   0:00.00 CTCE 0E41 Liste        inet_csk_accept                
19593 root      -7   0 12.776g  21416   5780 S  0.0  0.1   0:00.00 CTCE 0E42 Liste        inet_csk_accept                
19594 root      -7   0 12.776g  21416   5780 S  0.0  0.1   0:00.00 CTCE 0E43 Liste        inet_csk_accept                

After the wait has mysteriously ended, noting that panel_display has replaced impl_thread :

top - 15:44:52 up 23:30,  1 user,  load average: 0.43, 0.37, 0.37
Threads:  87 total,   0 running,  87 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.5 us,  0.3 sy,  0.1 ni, 99.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 16106100 total, 12793104 free,  1149004 used,  2163992 buff/cache
KiB Swap:  8189948 total,  8189948 free,        0 used. 14552748 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                WCHAN                          
19544 root      -9   0 12.839g  21472   5780 S  5.3  0.1   0:20.34 timer_thread           hrtimer_nanosleep              
19540 root      -7   0 12.839g  21472   5780 S  0.0  0.1   0:00.62 panel_display          poll_schedule_timeout          
19541 root      -6   0 12.839g  21472   5780 S  0.0  0.1   0:00.00 logger_thread          pipe_wait   
19542 root      -3   0 12.839g  21472   5780 S  0.0  0.1   0:00.00 watchdog_thread        hrtimer_nanosleep              
19543 root      -4   0 12.839g  21472   5780 S  0.0  0.1   0:00.00 Processor CP00         futex_wait_queue_me            
19545 root      -4   0 12.839g  21472   5780 S  0.0  0.1   0:00.00 Processor CP01         futex_wait_queue_me            
19546 root      -4   0 12.839g  21472   5780 S  0.0  0.1   0:00.00 Processor CP02         futex_wait_queue_me            
19547 root      -4   0 12.839g  21472   5780 S  0.0  0.1   0:00.00 Processor CP03         futex_wait_queue_me            
19548 root      -4   0 12.839g  21472   5780 S  0.0  0.1   0:00.00 Processor CP04         futex_wait_queue_me            
19549 root      -4   0 12.839g  21472   5780 S  0.0  0.1   0:00.00 Processor CP05         futex_wait_queue_me            
19550 root      -7   0 12.839g  21472   5780 S  0.0  0.1   0:00.00 shared_server          poll_schedule_timeout          
19551 root      -6   0 12.839g  21472   5780 S  0.0  0.1   0:00.10 console_connect        poll_schedule_timeout          
19554 root      -7   0 12.839g  21472   5780 S  0.0  0.1   0:00.00                        tun_do_read.part.50            
19591 root      -7   0 12.839g  21472   5780 S  0.0  0.1   0:00.00 CTCE 0E40 Liste        inet_csk_accept                
19592 root      -7   0 12.839g  21472   5780 S  0.0  0.1   0:00.00 CTCE 0E41 Liste        inet_csk_accept                
19593 root      -7   0 12.839g  21472   5780 S  0.0  0.1   0:00.00 CTCE 0E42 Liste        inet_csk_accept                
19594 root      -7   0 12.839g  21472   5780 S  0.0  0.1   0:00.00 CTCE 0E43 Liste        inet_csk_accept    

I hope this may perhaps give some clue as to what the problem could be.

Cheers,

Peter

Fish-Git commented 6 years ago

Thank you for your preliminary analysis, Peter! Adding the WCHAN field to your top display was a pure stroke of genius! (in my opinion)

After a quick Google search on "random_read.part" (impl_thread) and seeing the words "entropy", "urandom" and "secret" appear in the only two search results that Google displayed, I am now 99.99% sure I know what is causing the delay! :)

Let me do a quick test to confirm my suspicion and I'll get back to you with my results!

Fish-Git commented 6 years ago

BINGO!  :)))

The fix for this issue will be committed shortly!

Fish-Git commented 6 years ago

Closed by commit e406feb212265230730a7c3079f29935507acf4a!

If you review the patch (the commit) it should hopefully be apparent what was causing the delay. ;-)

I should note too that this bug was unfortunately introduced due to my own inexperience with Linux. I apologize to the community for that. :(

I research things as best I can and then implement what I believe to be the correct code based on that research, but I sometimes get it wrong. This probably won't be the last time something like this happens either, and all I can say is that I hope you all will forgive me when it does happen, and help me to try and resolve whatever issue it may be. Thanks for your understanding.

Peter-J-Jansen commented 6 years ago

Absolutely super Fish! I also did a Google search but to me the results meant nothing.

I already sent an e-mail to the original person who experienced the wait, thinking it was caused by the CTCE device as that was the last he saw in the Hercules log/console, asking him to re-build and test and let us know.

Cheers,

Peter