SDL-Hercules-390 / hyperion

The SDL Hercules 4.x Hyperion version of the System/370, ESA/390, and z/Architecture Emulator

Default install of Regina rexx segfaults when LCS device is closed #163

Closed fbi-ranger closed 5 years ago

fbi-ranger commented 5 years ago

While trying to migrate to SDL Hercules 4.2, I ran into a problem: when I end Hercules by entering 'quit' on the herc ====> command line, the program terminates with a segmentation fault:

HHC01603I quit                                                                              
HHC00101I Thread id 00007f9ee8a83700, prio -1, name 'Processor CP00' ended
HHC00101I Thread id 00007f9ee8c86700, prio -1, name 'Processor CP01' ended
HHC00101I Thread id 00007f9ee8b85700, prio -1, name 'Processor CP02' ended
HHC00101I Thread id 00007f9ee8881700, prio -1, name 'Processor CP03' ended
/usr/local/bin/iml-hercules: line 2:  9570 Segmentation fault      (core dumped) /local/sys1/z390/hercules/bin/hercules -f /local/sys1/z390/etc/hercules.cnf

  A correct quit looks like:

HHC00101I Thread id 00007f7b1e680700, prio -1, name 'Processor CP00' ended
HHC00101I Thread id 00007f7b1e883700, prio -1, name 'Processor CP01' ended
HHC00101I Thread id 00007f7b1e782700, prio -1, name 'Processor CP02' ended
HHC00101I Thread id 00007f7b1e47e700, prio -1, name 'Processor CP03' ended
HHC00417I 0:0150 CKD file /local/sys1/s390/mdasd/DOSRES.0150: cache hits 0, misses 0, waits 0
HHC00417I 0:0151 CKD file /local/sys1/s390/mdasd/SYSWK1.0151: cache hits 0, misses 0, waits 0
HHC00417I 0:1123 CKD file /local/sys1/s390/mdasd/240RES.1123: cache hits 0, misses 0, waits 0
HHC00417I 0:1124 CKD file /local/sys1/s390/mdasd/240W01.1124: cache hits 0, misses 0, waits 0
HHC00417I 0:1125 CKD file /local/sys1/s390/mdasd/240W02.1125: cache hits 0, misses 0, waits 0
HHC00417I 0:1126 CKD file /local/sys1/s390/mdasd/240W03.1126: cache hits 0, misses 0, waits 0
HHC00417I 0:1128 CKD file /local/sys1/s390/mdasd/240SPL.1128: cache hits 0, misses 0, waits 0
HHC00417I 0:1129 CKD file /local/sys1/s390/mdasd/240TMP.1129: cache hits 0, misses 0, waits 0
HHC00417I 0:0A80 CKD file /local/sys1/s390/mdasd/ZDRES1.0A80: cache hits 0, misses 0, waits 0
....
HHC00417I 0:650A CKD file /local/sys1/s390/dasd/NMCT01.650A: cache hits 0, misses 0, waits 0
HHC00417I 0:650B CKD file /local/sys1/s390/dasd/NMCT02.650B: cache hits 0, misses 0, waits 0
HHC00417I 0:650C CKD file /local/sys1/s390/dasd/NMCT03.650C: cache hits 0, misses 0, waits 0
HHC00417I 0:6510 CKD file /local/sys1/s390/dasd/NBCC10.6510: cache hits 0, misses 0, waits 0
HHC00417I 0:6511 CKD file /local/sys1/s390/dasd/NBCC11.6511: cache hits 0, misses 0, waits 0
HHC00417I 0:651C CKD file /local/sys1/s390/dasd/NLXC04.651C: cache hits 0, misses 0, waits 0
HHC00417I 0:651D CKD file /local/sys1/s390/dasd/NLXC03.651D: cache hits 0, misses 0, waits 0
HHC00417I 0:651E CKD file /local/sys1/s390/dasd/NLXC02.651E: cache hits 0, misses 0, waits 0
HHC00417I 0:651F CKD file /local/sys1/s390/dasd/NLXC01.651F: cache hits 0, misses 0, waits 0
HHC00417I 0:6600 CKD file /local/sys1/s390/dasd/NSMS01.6600: cache hits 0, misses 0, waits 0
HHC00417I 0:6601 CKD file /local/sys1/s390/dasd/NSMS02.6601: cache hits 0, misses 0, waits 0
HHC00101I Thread id 00007f63810e8700, prio -1, name 'console_connect' ended
HHC01427I Main storage released
HHC01427I Expanded storage released
HHC01422I Configuration released
HHC00101I Thread id 00007f6380391700, prio -1, name 'logger_thread' ended
HHC01424I All termination routines complete
HHC01425I Hercules shutdown complete
HHC01412I Hercules terminated
HHC00101I Thread id 00007f63814fc700, prio -1, name 'timer_thread' ended
HHC00101I Thread id 00007f6384d07740, prio -1, name 'panel_display' ended

  Checking my configuration file, I have determined the following triggers the problem:

#
# LCS
#
 F00.2 LCS -n /dev/net/tun 10.0.0.191

  When the LCS statement is active, the segmentation fault occurs. If it is commented out, Hercules stops properly.

Doing a CCW trace on devices F00 and F01 shows the following:

HHC01603I t+f00                                                                             
HHC02204I CCW trace for 0:0F00 set to ON                                                    
HHC01603I t+f01                                                                             
HHC02204I CCW trace for 0:0F01 set to ON                                                    
HHC01603I quit                                                                              
HHC01420I Begin Hercules shutdown                                                           
HHC00101I Thread id 00007f1c89fff700, prio -1, name 'Processor CP00' ended
HHC00101I Thread id 00007f1c8a202700, prio -1, name 'Processor CP01' ended
HHC00101I Thread id 00007f1c8a101700, prio -1, name 'Processor CP02' ended
HHC00101I Thread id 00007f1c89dfd700, prio -1, name 'Processor CP03' ended
HHC00966I 0:0F00 CTC: lcs triggering port 00 event
/usr/local/bin/iml-hercules: line 2:  9687 Segmentation fault      (core dumped) /local/sys1/z390/hercules/bin/hercules -f /local/sys1/z390/etc/hercules.cnf
Fish-Git commented 5 years ago

Let me look into this, Florian, and I'll get back to you.

fbi-ranger commented 5 years ago

Thank you Fish. That is very kind of you.

But please do not forget it’s Christmas. Have a nice Christmas Eve.

Kind regards, Florian

Fish-Git commented 5 years ago

Thank you Fish. That is very kind of you.

Not at all! I have no life outside of Hercules and I couldn't sleep anyway. :)

But please do not forget it’s Christmas. Have a nice Christmas Eve.

Thank you. Same to you too!

Fish-Git commented 5 years ago

I am unable to reproduce this problem on my CentOS 6.10 system. I started hercules with an LCS device (and the device was successfully opened) and then immediately did a quit, and Hercules ended normally just like it always does for me. I tried it both as a regular user and as root too, and both times Hercules ended cleanly.

I did not actually try to IPL my guest however. Does the problem occur for you only when a guest operating system is IPLed and then later ended? Or does it occur just starting Hercules and then immediately quitting, without doing an IPL?

Also, have you checked your build log to ensure Hercules was built correctly? (without any errors or warnings).

Also, when you built your Hercules, did you build the External Packages beforehand? Or are you using the ones that come delivered with Hercules? If you didn't bother to build the External Packages for yourself, you might want to try doing that and then rebuilding Hercules afterwards. Perhaps that's where your problem is? What system are you running on anyway?

Fish-Git commented 5 years ago

One other thing too: you might want to use gdb to try and determine exactly where in Hercules it is crashing. I found a web page on stackoverflow.com that explains how to do it:

Basically, based on the very first reply of the above mentioned stackoverflow post, you should do:

$ gdb hercules
(gdb) run -f myhercconfig.cnf
<segfault happens here>
(gdb) backtrace
<offending code is shown here>

I would try it myself, but as I explained, I was unable to make Hercules crash. It works fine for me. But since it crashes for you, it would be very helpful if you could determine precisely where Hercules is crashing. Thanks!

fbi-ranger commented 5 years ago

No, I started Hercules and quit immediately. No OS was started before quit.

Linux is openSUSE 15 with the latest fixes applied. Hercules 4.0.0 (the other Hyperion) runs without any problem.

Config is now shirked to some DASDs in order to keep the log small. LCS is enabled (F00/F01).

I have started gdb:

(gdb) run -f z390/etc/hercmini.cnf 
Starting program: /local/sys1/z390/herc15001/bin/hercules -f z390/etc/hercmini.cnf
Missing separate debuginfo for /lib64/ld-linux-x86-64.so.2
Try: zypper install -C "debuginfo(build-id)=4062821f420b0c3d46ea03f208cae1a710516c4e"
Missing separate debuginfo for /lib64/librt.so.1
Try: zypper install -C "debuginfo(build-id)=a1a84d304e283d52e44332ec3fbf2f6f705bd5ff"
Missing separate debuginfo for /lib64/libresolv.so.2
Try: zypper install -C "debuginfo(build-id)=70404535e145645b599c469ea4476fc4c8357b03"
Missing separate debuginfo for /lib64/libm.so.6
Try: zypper install -C "debuginfo(build-id)=1e8038d58788ff7546c54ef151a441567c5119dc"
Missing separate debuginfo for /lib64/libdl.so.2
Try: zypper install -C "debuginfo(build-id)=466795ee7b9ca76122c66d034e7f18c7593d306e"
Missing separate debuginfo for /usr/lib64/libbz2.so.1
Try: zypper install -C "debuginfo(build-id)=78a5e01ade6b3d8db9bc9bcf7d7452c057ab7ac1"
Missing separate debuginfo for /lib64/libz.so.1
Try: zypper install -C "debuginfo(build-id)=9ca7a2b246871c3eeaa954a4a1315bbbbd335cc7"
Missing separate debuginfo for /lib64/libpthread.so.0
Try: zypper install -C "debuginfo(build-id)=f82798ed148c2a88dcddbaa67c838c824e1a43e9"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Missing separate debuginfo for /lib64/libc.so.6
Try: zypper install -C "debuginfo(build-id)=95b799e45a989e22af6e9d31ec729170e2c92dd2"
[New Thread 0x7ffff5818700 (LWP 28233)]
HHC00109E set_thread_priority( 5 ) failed: Operation not permitted
HHC00007I Previous message from function 'impl' at impl.c(837)
HHC00110W Defaulting all threads to priority 1
HHC00007I Previous message from function 'impl' at impl.c(840)
HHC00100I Thread id 00007ffff7fc9740, prio -1, name 'impl_thread' started
HHC00100I Thread id 00007ffff5818700, prio -1, name 'logger_thread' started
HHC01413I Hercules version 4.2.0.0-SDL-gc8addaaf-modified (4.2.0.0)
HHC01414I (C) Copyright 1999-2018 by Roger Bowler, Jan Jaeger, and others
HHC01417I YBI-15001-9473
HHC01415I Build date: Dec 23 2018 at 22:50:08
HHC01417I Built with: GCC 7.3.1 20180323 [gcc-7-branch revision 258812]
HHC01417I Build type: GNU/Linux x86_64 host architecture build
HHC01417I Modes: S/370 ESA/390 z/Arch
HHC01417I Max CPU Engines: 12
HHC01417I Using   shared libraries
HHC01417I Using   setresuid() for setting privileges
HHC01417I Using   POSIX threads Threading Model
HHC01417I Using   Error-Checking Mutex Locking Model
HHC01417I With    Shared Devices support
HHC01417I With    Dynamic loading support
HHC01417I With    External GUI support
HHC01417I With    IPV6 support
HHC01417I With    HTTP Server support
HHC01417I With    sqrtl support
HHC01417I With    SIGABEND handler
HHC01417I With    CCKD BZIP2 support
HHC01417I With    HET BZIP2 support
HHC01417I With    ZLIB support
HHC01417I With    Regular Expressions support
HHC01417I Without Object REXX support
HHC01417I With    Regina REXX support
HHC01417I With    Automatic Operator support
HHC01417I Without National Language Support
HHC01417I With    CCKD64 Support
HHC01417I Machine dependent assists: cmpxchg1 cmpxchg4 cmpxchg8 hatomics=C11
HHC01417I Running on: hercules (Linux-4.12.14-lp150.12.28-default x86_64) MP=8
HHC01417I Built with decNumber external package version 3.68.0.79-g53f2512
HHC01417I Built with SoftFloat external package version 3.5.0.82-g1c66591
HHC01417I Built with telnet external package version 1.0.0.41-ged0ddec
HHC00018W Hercules is NOT running in elevated mode
HHC00007I Previous message from function 'impl' at impl.c(895)
Missing separate debuginfo for /lib64/libnss_files.so.2
Try: zypper install -C "debuginfo(build-id)=e71acc15f2935641bffdb8f78f0faef6f4c7acff"
HHC00150I Crypto module loaded (C) Copyright 2003-2016 by Bernard van der Helm
HHC01417I Built with crypto external package version 1.0.0.26-gefe199e
HHC00151I Activated facility: Message Security Assist
HHC00151I Activated facility: Message Security Assist Extension 1, 2, 3 and 4
Missing separate debuginfo for /lib64/libcrypt.so.1
Try: zypper install -C "debuginfo(build-id)=9a002f8c48735ff1fe0cbe4aa64b9a0cb4b2f84e"
HHC17528I REXX(Regina) VERSION: REXX-Regina_3.9.1 5.00 5 Apr 2015
HHC17529I REXX(Regina) SOURCE:  UNIX
HHC17525I REXX(Regina) Rexx has been started/enabled
HHC17500I REXX(Regina) Mode            : Command
HHC17500I REXX(Regina) MsgLevel        : Off
HHC17500I REXX(Regina) MsgPrefix       : Off
HHC17500I REXX(Regina) ErrPrefix       : Off
HHC17500I REXX(Regina) Resolver        : On
HHC17500I REXX(Regina) SysPath    ( 6) : On
HHC17500I REXX(Regina) RexxPath   ( 0) :
HHC17500I REXX(Regina) Extensions ( 8) : .REXX:.rexx:.REX:.rex:.CMD:.cmd:.RX:.rx
[New Thread 0x7ffff4bc8700 (LWP 28234)]
[New Thread 0x7ffff48c4700 (LWP 28235)]
HHC00111I Thread CPU Time IS available (_POSIX_THREAD_CPUTIME=0)
[New Thread 0x7ffff47c3700 (LWP 28236)]
HHC00100I Thread id 00007ffff47c3700, prio -1, name 'timer_thread' started
HHC00100I Thread id 00007ffff48c4700, prio -1, name 'Processor CP00' started
HHC00811I Processor CP00: architecture mode z/Arch
HHC02204I CPUSERIAL      set to 1BA2EF
HHC02204I CPUMODEL       set to 2827
HHC02204I MODEL          set to hardware(H20) capacity(H20) perm() temp()
HHC02204I PLANT          set to 01
HHC17003I MAIN     storage is 8G (mainsize); storage is not locked
[New Thread 0x7ffff4ac7700 (LWP 28237)]
HHC00111I Thread CPU Time IS available (_POSIX_THREAD_CPUTIME=0)
HHC00100I Thread id 00007ffff4ac7700, prio -1, name 'Processor CP01' started
HHC00811I Processor CP01: architecture mode z/Arch
[New Thread 0x7ffff49c6700 (LWP 28238)]
HHC00111I Thread CPU Time IS available (_POSIX_THREAD_CPUTIME=0)
HHC00100I Thread id 00007ffff49c6700, prio -1, name 'Processor CP02' started
HHC00811I Processor CP02: architecture mode z/Arch
[New Thread 0x7ffff46c2700 (LWP 28239)]
HHC00111I Thread CPU Time IS available (_POSIX_THREAD_CPUTIME=0)
HHC00100I Thread id 00007ffff46c2700, prio -1, name 'Processor CP03' started
HHC00811I Processor CP03: architecture mode z/Arch
HHC02204I NUMCPU         set to 4
HHC02204I MANUFACTURER   set to IBM
HHC02204I ARCHLVL        set to z/Arch
HHC00898W Facility( 044_PFPO ) *Enabled for z/Arch
HHC00007I Previous message from function 'facility_enable_disable' at facility.c(3438)
HHC02204I ECPSVM         set to disabled
HHC02204I LOADPARM       set to 
HHC02204I LPARNAME       set to SYSZ01
HHC02204I LPARNUM        set to 1
HHC02204I CPUIDFMT       set to 0
HHC02204I PANTITLE       set to z/VM 6.3 PHOENIX SYSRES 6300
HHC02204I SCPIMPLY       set to ON
HHC01474I Using internal codepage conversion table default
HHC02204I DIAG8CMD       set to ENABLE  NOECHO
HHC02204I PANRATE        set to SLOW
[New Thread 0x7ffff43af700 (LWP 28240)]
HHC00100I Thread id 00007ffff43af700, prio -1, name 'console_connect' started
HHC01024I Waiting for console connections on port 3270
HHC01250E 0:000C Card: error in function access(): No such file or directory
HHC00007I Previous message from function 'cardrdr_init_handler' at cardrdr.c(322)
HHC01463E 0:000C device initialization failed
HHC00007I Previous message from function 'attach_device' at config.c(1301)
[Detaching after fork from child process 28241]
HHC00901I 0:0F00 LCS: Interface tap0, type TAP opened
HHC00921I CTC: lcs device port 00: manual Multicast assist enabled
HHC00935I CTC: lcs device port 00: manual Checksum Offload enabled
[New Thread 0x7fffd77ca700 (LWP 28243)]
HHC01437I Config file[164] z390/etc/hercmini.cnf: including file /local/sys1/z390/etc/plxpex.cnf
HHC00414I 0:461A CKD file /local/sys1/s390/dasd/PLXPEA.461A: cyls 32760 heads 15 tracks 491400 trklen 56832
HHC00414I 0:461B CKD file /local/sys1/s390/dasd/PLXPEB.461B: cyls 32760 heads 15 tracks 491400 trklen 56832
HHC00414I 0:461C CKD file /local/sys1/s390/dasd/PLXPEC.461C: cyls 32760 heads 15 tracks 491400 trklen 56832
HHC00414I 0:461D CKD file /local/sys1/s390/dasd/PLXPED.461D: cyls 32760 heads 15 tracks 491400 trklen 56832
HHC01437I Config file[196] z390/etc/hercmini.cnf: including file /local/sys1/z390/etc/zvm630.cnf
HHC00414I 0:6300 CKD file /local/sys1/s390/mdasd/V631RS.6300: cyls 10017 heads 15 tracks 150255 trklen 56832
HHC00414I 0:6301 CKD file /local/sys1/s390/mdasd/V63RL1.6301: cyls 10017 heads 15 tracks 150255 trklen 56832
HHC00414I 0:6302 CKD file /local/sys1/s390/mdasd/V63CM1.6302: cyls 10017 heads 15 tracks 150255 trklen 56832
HHC00414I 0:6303 CKD file /local/sys1/s390/mdasd/V631S1.6303: cyls 10017 heads 15 tracks 150255 trklen 56832
HHC00414I 0:6304 CKD file /local/sys1/s390/mdasd/V631P1.6304: cyls 10017 heads 15 tracks 150255 trklen 56832
HHC00414I 0:6305 CKD file /local/sys1/s390/mdasd/V631W1.6305: cyls 10017 heads 15 tracks 150255 trklen 56832
HHC00414I 0:6306 CKD file /local/sys1/s390/mdasd/V631T1.6306: cyls 10017 heads 15 tracks 150255 trklen 56832
HHC00414I 0:6309 CKD file /local/sys1/s390/mdasd/V63SRC.6309: cyls 3339 heads 15 tracks 50085 trklen 56832

  Hercules started correctly. F00 is opened as TAP. F01 remains closed, probably because no OS has been started.

I enter quit:

HHC00101I Thread id 00007ffff48c4700, prio -1, name 'Processor CP00' ended
[Thread 0x7ffff48c4700 (LWP 28235) exited]
HHC00101I Thread id 00007ffff4ac7700, prio -1, name 'Processor CP01' ended
[Thread 0x7ffff4ac7700 (LWP 28237) exited]
HHC00101I Thread id 00007ffff49c6700, prio -1, name 'Processor CP02' ended
[Thread 0x7ffff49c6700 (LWP 28238) exited]
HHC00101I Thread id 00007ffff46c2700, prio -1, name 'Processor CP03' ended
[Thread 0x7ffff46c2700 (LWP 28239) exited]

Thread 10 "LCS_PortThread" received signal SIGUSR2, User defined signal 2.
[Switching to Thread 0x7fffd77ca700 (LWP 28243)]
0x00007ffff5be3f2c in close () from /lib64/libpthread.so.0

(gdb) backtrace
#0  0x00007ffff5be3f2c in close () from /lib64/libpthread.so.0
#1  0x00007fffd77cfa06 in LCS_PortThread (arg=arg@entry=0x7b9630) at ctc_lcs.c:2280
#2  0x00007ffff6dcae1d in hthread_func (arg2=0x7cea50) at hthreads.c:796
#3  0x00007ffff5bda559 in start_thread () from /lib64/libpthread.so.0
#4  0x00007ffff591181f in clone () from /lib64/libc.so.6
(gdb) 
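
A note on reading the gdb output above: the stop it reports is for SIGUSR2, not SIGSEGV. gdb stops on SIGUSR2 by default, and a signal like this is commonly sent on purpose to knock a thread out of a blocking read() so that it can shut down. Below is a minimal C sketch of that general pattern, assuming a pipe in place of the tap device. All names (port_thread, run_wakeup_demo and so on) are hypothetical, and none of this is taken from ctc_lcs.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-ins; not code from Hercules. */
static int        g_pipe[2];     /* read end plays the role of the tap fd   */
static atomic_int g_stop;        /* set by the controller to request exit   */
static atomic_int g_done;        /* set by the worker just before it ends   */
static int        g_saw_eintr;   /* 1 if read() was interrupted by a signal */

/* The handler does nothing; its only job is to make read() return EINTR. */
static void on_sigusr2(int signo) { (void)signo; }

static void *port_thread(void *arg)
{
    char buf[64];
    (void)arg;
    while (!atomic_load(&g_stop)) {
        ssize_t n = read(g_pipe[0], buf, sizeof buf);  /* blocks: no writer */
        if (n < 0 && errno == EINTR) { g_saw_eintr = 1; continue; }
        if (n <= 0) break;
    }
    close(g_pipe[0]);            /* the worker closes its own descriptor */
    atomic_store(&g_done, 1);
    return NULL;
}

/* Starts the worker, then stops it with SIGUSR2. Returns 0 on success. */
int run_wakeup_demo(void)
{
    struct sigaction sa;
    struct timespec  pause50ms = { 0, 50 * 1000 * 1000 };
    struct timespec  pause10ms = { 0, 10 * 1000 * 1000 };
    pthread_t        tid;

    atomic_store(&g_stop, 0);
    atomic_store(&g_done, 0);
    g_saw_eintr = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr2;  /* no SA_RESTART, so read() is not resumed */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR2, &sa, NULL) != 0) return -1;
    if (pipe(g_pipe) != 0) return -1;
    if (pthread_create(&tid, NULL, port_thread, NULL) != 0) return -1;

    nanosleep(&pause50ms, NULL); /* give the worker time to block in read() */
    atomic_store(&g_stop, 1);
    while (!atomic_load(&g_done)) {      /* re-signal until the worker ends */
        pthread_kill(tid, SIGUSR2);
        nanosleep(&pause10ms, NULL);
    }
    pthread_join(tid, NULL);
    assert(atomic_load(&g_done));
    close(g_pipe[1]);
    return 0;
}
```

If something like this is what the LCS port thread is doing, then entering continue at the (gdb) prompt should let the run proceed to the real SIGSEGV, where backtrace would show the faulting frame.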
fbi-ranger commented 5 years ago

Starting an OS (z/VM) works, and the LCS also works correctly. However, an orderly shutdown followed by quit still leads to the segmentation fault. So it makes no difference whether an OS is started or not.

I also did a devlist F00 and devlist F01 before starting the OS. Both commands show that the device is open.

fbi-ranger commented 5 years ago

Restarting Hercules after the SEGMENTATION FAULT leads to the following error messages during Hercules startup:

HHC00138E Error setting TUN/TAP mode : Interrupted system call
HHC00007I Previous message from function 'TUNTAP_CreateInterface' at tuntap.c(269)
HHC00900E 0:0F00 LCS: Error in function TUNTAP_CreateInterface: Unknown error -1
HHC00007I Previous message from function 'LCS_Init' at ctc_lcs.c(344)
HHC01463E 0:0F01 device initialization failed
HHC00007I Previous message from function 'attach_device' at config.c(1301)

  Exiting the user running Hercules and renewing the session (su - userid) works fine. The LCS devices are working again.

Fish-Git commented 5 years ago

Config is now shirked to some DASDs in order to keep the log small.

I'm guessing "shirked to some DASDs" means you've simply removed some of your dasd devices from your Hercules configuration file.

 

Thread 10 "LCS_PortThread" received signal SIGUSR2, User defined signal 2.
[Switching to Thread 0x7fffd77ca700 (LWP 28243)]
0x00007ffff5be3f2c in close () from /lib64/libpthread.so.0

I'm guessing the segmentation fault occurred at that point, yes?

 

(gdb) backtrace
#0  0x00007ffff5be3f2c in close () from /lib64/libpthread.so.0
#1  0x00007fffd77cfa06 in LCS_PortThread (arg=arg@entry=0x7b9630) at ctc_lcs.c:2280
#2  0x00007ffff6dcae1d in hthread_func (arg2=0x7cea50) at hthreads.c:796
#3  0x00007ffff5bda559 in start_thread () from /lib64/libpthread.so.0
#4  0x00007ffff591181f in clone () from /lib64/libc.so.6
(gdb) 

Hmmm... It looks to me like it's a bug in your system's tuntap device driver, not Hercules. Hercules is simply calling the close() function for the tuntap device, and for whatever reason, the call to close() is crashing. I don't think there's much we can do about that!

My suspicion that it's a bug in your system's tuntap device driver seems to be confirmed by your later comment:

Restarting Hercules after the SEGMENTATION FAULT leads to following error messages during start up of Hercules:

HHC00138E Error setting TUN/TAP mode : Interrupted system call
HHC00007I Previous message from function 'TUNTAP_CreateInterface' at tuntap.c(269)
HHC00900E 0:0F00 LCS: Error in function TUNTAP_CreateInterface: Unknown error -1
HHC00007I Previous message from function 'LCS_Init' at ctc_lcs.c(344)
HHC01463E 0:0F01 device initialization failed
HHC00007I Previous message from function 'attach_device' at config.c(1301)

Which indicates to me an obvious error/problem (i.e. "bug!") in your system's tuntap device driver. If Hercules is able to successfully open the tuntap device and set the mode during its previous attempts, but is unable to do the exact same thing on subsequent attempts (with the reported cause being "Interrupted system call" and "Unknown error -1"), then it sure appears to me as if your tuntap device is somehow borked (broken).

This is further confirmed by:

Exiting the user running Hercules and renewing the session (su - userid) works fine. The LCS devices are working again.

(where I'm presuming that: "exiting the user running Hercules and renewing the session" translates to: "I logged out of my userid (i.e. returned back to my system's login screen) and logged in again". Yes?)

Which would seem to indicate that doing so somehow managed to fix whatever problem there was with your tuntap device. (I'm guessing the same fix would have occurred if you had rebooted your system too.) Because after you did that (logged out and then logged back in again), now your LCS devices are working again! (I presume that means Hercules is no longer crashing, yes? Please let me know if that's not true. Please let me know if the problem is still present, i.e. please let me know if Hercules is still crashing when you do a quit.)

Presuming I'm understanding you correctly, I'm going to mark this issue as "Unknown" (since it doesn't look like a Hercules bug) as well as "Close pending" until I hear back from you telling me whether my stated presumptions are correct or not.

(Very weird... Did you maybe apply some maintenance (system updates) that updated your tuntap driver and then forget to reboot? Or something similar? I know Linux is known for its stability and for rarely needing a reboot, but maybe this is one instance where a reboot was required and you didn't do it? Hey! I'm just speculating! I'm not a Linux person!)

fbi-ranger commented 5 years ago

Well it seems really an issue with opensuse but stopping Hercules after a new start of the session leads again to the same error and it happens only with the LCS device. With OSA which I guess uses the same tun/tap the crash does not happen at all.

However it does also not happen with Hyperion 4.0.0 running on exactly this system. SDL Hyperion is installed in a different directory. The versions are also built on this system with the same options. So I can switch easily between them.

What I can try is to run Hercules as root to see if it is a privilege problem.

Currently I am installing z/VM 6.4 as second level system. This will take a day or so. I would like to see if it will IPL.

Fish-Git commented 5 years ago

Well it seems really an issue with opensuse but stopping Hercules after a new start of the session leads again to the same error and it happens only with the LCS device.

Dang! :(

Okay, then there's obviously still some problem (some unknown bug) somewhere in SDL Hyperion's LCS handler that seems to only impact some Linux distributions (e.g. OpenSUSE in your case), so I'm re-opening this issue again.

With OSA which I guess uses the same tun/tap the crash does not happen at all.

OSA devices (QETH) use tuntap in 'tun' mode. LCS devices use tuntap in 'tap' mode. So the problem (the bug), whatever it is, is only in the LCS handler's tap handling logic.

However it does also not happen with Hyperion 4.0.0 running on exactly this system.

Well then that would seem to imply a bug was indeed introduced somewhere in SDL Hyperion's LCS handler. It might be with my new offloading code that queries the tuntap device to try and determine whether certain hardware "offloads" are possible or not (e.g. checksum offloading for example). That logic is likely what is triggering this new unexpected/undesirable behavior on certain systems such as yours. I'll have to call in @mcisho and/or @ivan-w for help.

Guys?   HELP!   :(

mcisho commented 5 years ago

Sorry Fish, can't offer any help with this one. Using a Fedora 29 host, SDL & LCS works fine for z/VM 6.1 and z/OS 1.13 guests.

OSA devices (QETH) use tuntap in 'tun' mode.

Only with layer 3; layer 2 uses tuntap in 'tap' mode.

Fish-Git commented 5 years ago

Sorry Fish, can't offer any help with this one. Using a Fedora 29 host, SDL & LCS works fine for z/VM 6.1 and z/OS 1.13 guests.

Oh well. Thanks anyway.

I guess this means there must be something unusual about the way Florian's tuntap device is defined/configured. We need to somehow determine what that "unusualness" is so we can get it fixed.

But it's going to be rather difficult to debug if we're unable to reproduce the problem. :(

OSA devices (QETH) use tuntap in 'tun' mode.

Only with layer 3; layer 2 uses tuntap in 'tap' mode.

(Oops!) You're right. I forgot about that. Thanks for reminding me.
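
To make the tun/tap distinction concrete: on Linux the mode is selected by a flag passed to the TUNSETIFF ioctl on the clone device (/dev/net/tun in the configuration above). Here is a minimal C sketch with hypothetical helper names tuntap_flags and tuntap_open. Adding IFF_NO_PI is an assumption made for the example, not something confirmed for Hercules in this thread.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Layer 2 (Ethernet frames) -> IFF_TAP, layer 3 (IP packets) -> IFF_TUN.
 * IFF_NO_PI (no packet-info header) is an assumption for this sketch. */
short tuntap_flags(int layer2)
{
    return (short)((layer2 ? IFF_TAP : IFF_TUN) | IFF_NO_PI);
}

/* Opens the clone device and asks the kernel for an interface in the
 * requested mode. Returns the fd and fills name_out, or returns -1. */
int tuntap_open(const char *clone_dev, int layer2, char name_out[IFNAMSIZ])
{
    struct ifreq ifr;
    int fd = open(clone_dev, O_RDWR);
    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof ifr);          /* empty name: kernel picks one */
    ifr.ifr_flags = tuntap_flags(layer2);
    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        close(fd);
        return -1;
    }
    memcpy(name_out, ifr.ifr_name, IFNAMSIZ);
    return fd;
}
```

Calling tuntap_open("/dev/net/tun", 1, name) would request a tap interface, such as the tap0 seen in the log. It normally needs sufficient privileges, so the call returns -1 when run as an unprivileged user.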

fbi-ranger commented 5 years ago

Well, I will reconfigure the system to see if I can run it without LCS. That is probably the simplest solution.

rgschmi commented 5 years ago

I'm also running opensuse 15.0 and SDL Hercules version 4.2.0.0-SDL-g6cab259d-modified (4.2.0.0) and LCS is running fine. My TAP is configured with bridging.

However, I do get the segmentation fault every time I exit Hercules, which isn't an issue for me. I don't think it's related, but I can't run Hercules as superuser unless I run it from the hyperion folder, and LCS requires superuser. BTW these releases of opensuse and Hercules mark the first time I could run CTCE between two opensuse machines.

Fish-Git commented 5 years ago

However, I do get the segmentation fault every time I exit Hercules ...

Can you provide a gdb backtrace?

Fish-Git commented 5 years ago

However, I do get the segmentation fault every time I exit Hercules ...

Can you provide a gdb backtrace?

Info: https://github.com/SDL-Hercules-390/hyperion/issues/163#issuecomment-449731456

rgschmi commented 5 years ago

I will give it a try tomorrow.

rgschmi commented 5 years ago

Backtrace of segmentation fault exiting Hercules. There were errors in the .cnf file which may be causing the problem.

Thread 9 "quit_thread" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff443e700 (LWP 6535)]
0x00007ffff5d8e87e in pthread_join () from /lib64/libpthread.so.0
(gdb) backtrace
#0  0x00007ffff5d8e87e in pthread_join () from /lib64/libpthread.so.0
#1  0x00007ffff6dcb00b in hthread_join_thread (tid=0, prc=prc@entry=0x0, location=location@entry=0x7ffff4a80fa2 "sockdev.c:62") at hthreads.c:826
#2  0x00007ffff4a7f666 in term_sockdev (arg=<optimized out>) at sockdev.c:62
#3  0x00007ffff6dc4767 in hdl_atexit () at hdl.c:683
#4  0x00007ffff7777567 in do_shutdown_now () at hscmisc.c:140
#5  0x00007ffff777ab74 in do_shutdown () at hscmisc.c:211
#6  0x00007ffff77581c5 in quit_thread (arg=arg@entry=0x0) at hsccmd.c:462
#7  0x00007ffff6dc9b9d in hthread_func (arg2=0x646fb0) at hthreads.c:796
#8  0x00007ffff5d8d559 in start_thread () from /lib64/libpthread.so.0
#9  0x00007ffff5ac481f in clone () from /lib64/libc.so.6
(gdb)

Fish-Git commented 5 years ago

There were errors in the .cnf file which may be causing the problem.

What type of errors? May we see your Hercules configuration file and your Hercules log file?

Fish-Git commented 5 years ago

@rgschmi Bob: contact me off list (privately) and I'll try to help you with your VS2017 problem. This GitHub Issue is not the proper place to resolve it. (And neither is the other thread!)

Fish-Git commented 5 years ago

Bob (@rgschmi) wrote:

Thread 9 "quit_thread" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff443e700 (LWP 6535)]
0x00007ffff5d8e87e in pthread_join () from /lib64/libpthread.so.0
(gdb) backtrace
#0  0x00007ffff5d8e87e in pthread_join () from /lib64/libpthread.so.0
#1  0x00007ffff6dcb00b in hthread_join_thread (tid=0, prc=prc@entry=0x0, location=location@entry=0x7ffff4a80fa2 "sockdev.c:62") at hthreads.c:826
#2  0x00007ffff4a7f666 in term_sockdev (arg=<optimized out>) at sockdev.c:62
... <snipped> ...

Interesting!

I hadn't noticed this before, but it appears the crash (SIGSEGV), at least for you, Bob, is occurring in sockdev.c's pthread_join() call, not in LCS code!

The gdb backtrace that Florian (@fbi-ranger) provided was for the SIGUSR2 signal that the LCS_PortThread receives as part of its normal close processing:

https://github.com/SDL-Hercules-390/hyperion/blob/6a1922ded5bcdb0db46be398d814e5179f8becc2/ctc_lcs.c#L792-L797

Thus I am inclined to believe the backtrace that Florian provided is actually an unintended "red herring" (i.e. a false lead, a misleading clue), and that the problem might actually be in our sockdev code, and not our LCS code as originally believed. That is to say, the sockdev bug, whatever it is, just happened to also impact LCS code for some as-yet-unknown networking reason.

Maybe...

I'm not sure...

I'm just guessing at this point!

Florian? (@fbi-ranger) Are you maybe also using a sockdev device like Bob obviously is? A socket printer perhaps? If you are, then that would lend weight to my theory. Please let us know whether or not you are also using a sockdev device.

In the meantime, Bob? (@rgschmi) Can you do me a favor and try again without any sockdev devices in your configuration? If it works (if no crash occurs), then that too would lend weight to my theory.

Thanks!
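
One detail in Bob's backtrace worth noting: frame #1 shows hthread_join_thread being called with tid=0. Joining a thread ID that was never filled in by pthread_create is undefined behavior, so a crash inside pthread_join would not be surprising. Whether that is what actually happens in sockdev.c has not been established here. The sketch below only shows one generic way to guard a join, and the `worker_t` type and function names are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping: remember whether the thread was ever started */
typedef struct {
    pthread_t tid;
    bool      started;
} worker_t;

/* Trivial thread body used only for the illustration */
void *worker_main(void *arg)
{
    return arg;
}

/* Start the worker and record success, so a later join is known to be valid */
int worker_start(worker_t *w)
{
    assert(w != NULL);
    int rc = pthread_create(&w->tid, NULL, worker_main, NULL);
    w->started = (rc == 0);
    return rc;
}

/* Join only if the thread was really created; otherwise report ESRCH
   instead of handing an unset thread ID to pthread_join */
int worker_join(worker_t *w)
{
    assert(w != NULL);
    if (!w->started)
        return ESRCH;
    int rc = pthread_join(w->tid, NULL);
    if (rc == 0)
        w->started = false;   /* a thread may be joined only once */
    return rc;
}
```

A shutdown routine written this way can call `worker_join` unconditionally, even when the device (and therefore its thread) was never created.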

fbi-ranger commented 5 years ago

Fish, No, sorry I do not have any other sockdev devices in my configuration.

To me this behavior looks as if a "subtask" (thread) corrupts some internal control blocks, and then the "close" of the LCS cannot complete because it has a broken control structure. But that impression dates back to my old days as a programmer, with a crashing CICS program in front of me, before we had storage protection in CICS. ;-)

In my configuration the segfault happens only when LCS devices are part of the config.

Fish-Git commented 5 years ago

Fish, No, sorry I do not have any other sockdev devices in my configuration.

Dang. I thought maybe I was onto something. Oh well. :(

Can you please do another gdb backtrace? The first one you provided was for a SIGUSR2 signal, which doesn't help. The SIGUSR2 is normal and doesn't tell us anything. I need to see the backtrace for the SIGSEGV, like what Bob provided.

I'm not familiar with gdb, but there is probably some way to tell gdb to "please ignore this signal and continue" whenever the SIGUSR2 occurs, which should hopefully lead to the eventual SIGSEGV, which is the event we need to see the backtrace for.

Thanks.

Fish-Git commented 5 years ago

I'm not familiar with gdb, but there is probably some way to tell gdb to "please ignore this signal and continue" whenever the SIGUSR2 occurs...

FYI:   I found the following:

It appears you can do either 1 or 2 (or both):

  1. press 'c' to continue whenever the SIGUSR2 break occurs.
  2. enter the gdb command "handle SIGUSR2 noprint nostop" when gdb is first started.

  I hope that helps!

(I want to see if your SIGSEGV backtrace is the same as Bob's)

fbi-ranger commented 5 years ago

Fish, Thanks for helping me with gdb. Here is what happens when I start Hercules and immediately end it by entering quit:

HHC00414I 0:6500 CKD file /local/sys1/s390/dasd/NBCSA1.6500: cyls 10017 heads 15 tracks 1502
HHC00414I 0:6501 CKD file /local/sys1/s390/dasd/NBCSA2.6501: cyls 10017 heads 15 tracks 1502
HHC00414I 0:6502 CKD file /local/sys1/s390/dasd/NBCSA3.6502: cyls 10017 heads 15 tracks 1502
HHC00414I 0:6503 CKD file /local/sys1/s390/dasd/NBCSA4.6503: cyls 10017 heads 15 tracks 1502
HHC00414I 0:6506 CKD file /local/sys1/s390/dasd/NBCC01.6506: cyls 1113 heads 15 tracks 16695
HHC00414I 0:6507 CKD file /local/sys1/s390/dasd/NBCC02.6507: cyls 1113 heads 15 tracks 16695
HHC00414I 0:6508 CKD file /local/sys1/s390/dasd/NBCC03.6508: cyls 1113 heads 15 tracks 16695
HHC00414I 0:650A CKD file /local/sys1/s390/dasd/NMCT01.650A: cyls 1113 heads 15 tracks 16695
HHC00414I 0:650B CKD file /local/sys1/s390/dasd/NMCT02.650B: cyls 1113 heads 15 tracks 16695
HHC00414I 0:650C CKD file /local/sys1/s390/dasd/NMCT03.650C: cyls 1113 heads 15 tracks 16695
HHC00414I 0:6510 CKD file /local/sys1/s390/dasd/NBCC10.6510: cyls 10017 heads 15 tracks 1502
HHC00414I 0:6511 CKD file /local/sys1/s390/dasd/NBCC11.6511: cyls 10017 heads 15 tracks 1502
HHC00414I 0:651C CKD file /local/sys1/s390/dasd/NLXC04.651C: cyls 10017 heads 15 tracks 1502
HHC00414I 0:651D CKD file /local/sys1/s390/dasd/NLXC03.651D: cyls 10017 heads 15 tracks 1502
HHC00414I 0:651E CKD file /local/sys1/s390/dasd/NLXC02.651E: cyls 10017 heads 15 tracks 1502
HHC00414I 0:651F CKD file /local/sys1/s390/dasd/NLXC01.651F: cyls 10017 heads 15 tracks 1502
HHC00414I 0:6600 CKD file /local/sys1/s390/dasd/NSMS01.6600: cyls 30051 heads 15 tracks 4507
HHC00414I 0:6601 CKD file /local/sys1/s390/dasd/NSMS02.6601: cyls 30051 heads 15 tracks 4507
HHC01603I quit                                                                              
HHC00101I Thread id 00007ffff45c1700, prio -1, name 'http_server' ended
[Thread 0x7ffff45c1700 (LWP 12206) exited]
HHC00101I Thread id 00007ffff48c4700, prio -1, name 'Processor CP00' ended
[Thread 0x7ffff48c4700 (LWP 12201) exited]
HHC00101I Thread id 00007ffff4ac7700, prio -1, name 'Processor CP01' ended
[Thread 0x7ffff4ac7700 (LWP 12203) exited]
HHC00101I Thread id 00007ffff49c6700, prio -1, name 'Processor CP02' ended
[Thread 0x7ffff49c6700 (LWP 12204) exited]
HHC00101I Thread id 00007ffff46c2700, prio -1, name 'Processor CP03' ended
[Thread 0x7ffff46c2700 (LWP 12205) exited]

**Thread 11 "LCS_PortThread" received signal SIGSEGV, Segmentation fault.**
[Switching to Thread 0x7ffff41ad700 (LWP 12210)]
0x00007ffff4e5ff6f in ?? () from /usr/lib64/libregina.so

(gdb) backtrace 
#0  0x00007ffff4e5ff6f in ?? () from /usr/lib64/libregina.so
#1  0x00007ffff4e5ddcd in ?? () from /usr/lib64/libregina.so
#2  0x00007ffff4e1801d in ?? () from /usr/lib64/libregina.so
#3  <signal handler called>
#4  0x00007ffff5be3f2c in close () from /lib64/libpthread.so.0
#5  0x00007fffd75cba06 in LCS_PortThread (arg=arg@entry=0x83b380) at ctc_lcs.c:2280
#6  0x00007ffff6dcad2d in hthread_func (arg2=0x850610) at hthreads.c:796
#7  0x00007ffff5bda559 in start_thread () from /lib64/libpthread.so.0
#8  0x00007ffff591181f in clone () from /lib64/libc.so.6
(gdb) 

Hope this helps you further.

Fish-Git commented 5 years ago

Interesting!

**Thread 11 "LCS_PortThread" received signal SIGSEGV, Segmentation fault.**
[Switching to Thread 0x7ffff41ad700 (LWP 12210)]
0x00007ffff4e5ff6f in ?? () from /usr/lib64/libregina.so

(gdb) backtrace 
#0  0x00007ffff4e5ff6f in ?? () from /usr/lib64/libregina.so
#1  0x00007ffff4e5ddcd in ?? () from /usr/lib64/libregina.so
#2  0x00007ffff4e1801d in ?? () from /usr/lib64/libregina.so
#3  <signal handler called>
#4  0x00007ffff5be3f2c in close () from /lib64/libpthread.so.0

It appears that for some unknown reason Regina REXX is crashing!

May I see the beginning of your Hercules logfile where Rexx is being loaded? E.g. It's the part of the logfile that looks similar to the following:

HHC17528I REXX(OORexx) VERSION: REXX-ooRexx_4.2.0(MT)_64-bit 6.04 22 Feb 2014
HHC17529I REXX(OORexx) SOURCE:  WindowsNT
HHC17525I REXX(OORexx) Rexx has been started/enabled
HHC17500I REXX(OORexx) Mode            : Subroutine
HHC17500I REXX(OORexx) MsgLevel        : Off
HHC17500I REXX(OORexx) MsgPrefix       : Off
HHC17500I REXX(OORexx) ErrPrefix       : Off
HHC17500I REXX(OORexx) Resolver        : On
HHC17500I REXX(OORexx) SysPath    (46) : On
HHC17500I REXX(OORexx) RexxPath   ( 0) :
HHC17500I REXX(OORexx) Extensions ( 8) : .REXX;.rexx;.REX;.rex;.CMD;.cmd;.RX;.rx

Some things to try:

  1. Before starting Hercules, define the environment variable HREXX_PACKAGE and set it to the value none. This should prevent Rexx from being loaded. Does the crash still occur?

  2. Try installing ooRexx instead of Regina Rexx, and set HREXX_PACKAGE to ooRexx. Does the crash still occur?

Thanks!

Fish-Git commented 5 years ago

2. ... ooRexx instead of Regina Rexx ...

Or ... in addition to Regina rexx.

That is to say, if you don't wish to uninstall Regina, you can still install ooRexx too (i.e. you can have both Rexxes installed at the same time and tell Hercules which one you want to use at runtime. Refer to the README.REXX document).

rgschmi commented 5 years ago

Fish,

I've tried to reproduce the segmentation fault, with and without sockdev devices to no avail. I am currently running OpenSUSE 15.0 and Hercules version 4.2.0.0-SDL-gcddb23fc-modified (4.2.0.0). As I mentioned in an email to you, I was NOT running an 'official' version of Hercules when I had the segmentation fault. I tried a formal shutdown of z/OS, a quiesce, and exiting Hercules with z/OS running, all with a printer and a 3390 connected via sockdev to no avail.

Fish-Git commented 5 years ago

... to no avail.

By "to no avail" I take it to mean you were unable to recreate the crash, correct? That is to say, your system (OpenSUSE 15.0, the same as what Florian is running) does not crash, regardless of whether you have an LCS device in your configuration or not and regardless of whether you have any sockdev devices or not, correct? In other words, your system always runs just fine, yes?

Fish-Git commented 5 years ago

Bob, do you have Regina Rexx installed on your system? If not, can you (temporarily?) install it and see whether that makes any difference or not? Florian has Regina Rexx installed on his system and it appears that's where the crash is occurring. I'm trying to determine (confirm or deny) whether or not it's Regina Rexx that is causing the crash. Thanks!

Fish-Git commented 5 years ago

@fbi-ranger , @rgschmi (Florian and Bob)

It appears both of you are running fairly old versions of SDL Hyperion 4.2.

It would be very helpful if both of you would do a git pull to pick up the latest and greatest version and try again. I want to make sure I (we) haven't been wasting time chasing a non-existent bug!

fbi-ranger commented 5 years ago

Fish,

Here are the REXX initialization messages:

HHC17528I REXX(Regina) VERSION: REXX-Regina_3.9.1 5.00 5 Apr 2015                                    
HHC17529I REXX(Regina) SOURCE:  UNIX                                                                 
HHC17525I REXX(Regina) Rexx has been started/enabled                                                 
HHC17500I REXX(Regina) Mode            : Command                                                     
HHC17500I REXX(Regina) MsgLevel        : Off                                                         
HHC17500I REXX(Regina) MsgPrefix       : Off                                                         
HHC17500I REXX(Regina) ErrPrefix       : Off                                                         
HHC17500I REXX(Regina) Resolver        : On                                                          
HHC17500I REXX(Regina) SysPath    ( 6) : On                                                          
HHC17500I REXX(Regina) RexxPath   ( 0) :                                                             
HHC17500I REXX(Regina) Extensions ( 8) : .REXX:.rexx:.REX:.rex:.CMD:.cmd:.RX:.rx                     

  Setting HREXX_PACKAGE=none does not help at all. The crash still occurs:

Thread 11 "LCS_PortThread" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff41a8700 (LWP 3036)]
0x00007ffff4e5af6f in ?? () from /usr/lib64/libregina.so
(gdb) backtrace 
#0  0x00007ffff4e5af6f in ?? () from /usr/lib64/libregina.so
#1  0x00007ffff4e58dcd in ?? () from /usr/lib64/libregina.so
#2  0x00007ffff4e1301d in ?? () from /usr/lib64/libregina.so
#3  <signal handler called>
#4  0x00007ffff5bdef2c in close () from /lib64/libpthread.so.0
#5  0x00007fffd75cab86 in LCS_PortThread (arg=0x843380) at ctc_lcs.c:2280
#6  0x00007ffff6dadd32 in hthread_func (arg2=0x869890) at hthreads.c:797
#7  0x00007ffff5bd5559 in start_thread () from /lib64/libpthread.so.0
#8  0x00007ffff590c81f in clone () from /lib64/libc.so.6

Since the ./configure '--disable-regina-rexx' option is not working, I had to uninstall Regina REXX:

HHC00109E set_thread_priority( 5 ) failed: Operation not permitted                                   
HHC00007I Previous message from function 'impl' at impl.c(848)                                       
HHC00110W Defaulting all threads to priority 1                                                       
HHC00007I Previous message from function 'impl' at impl.c(851)                                       
HHC00100I Thread id 00007fd5e17aa740, prio -1, name 'impl_thread' started                            
HHC00100I Thread id 00007fd5deff8700, prio -1, name 'logger_thread' started                          
HHC01413I Hercules version 4.2.0.0-SDL-g0f1b54b5-modified (4.2.0.0)                                  
HHC01414I (C) Copyright 1999-2019 by Roger Bowler, Jan Jaeger, and others                            
HHC01417I YBI-15007-9623                                                                             
HHC01415I Build date: Mar 31 2019 at 15:44:48                                                        
HHC01417I Built with: GCC 7.3.1 20180323 [gcc-7-branch revision 258812]                              
HHC01417I Build type: GNU/Linux x86_64 host architecture build                                       
HHC01417I Modes: S/370 ESA/390 z/Arch                                                                
HHC01417I Max CPU Engines: 12                                                                        
HHC01417I Using   shared libraries                                                                   
HHC01417I Using   setresuid() for setting privileges                                                 
HHC01417I Using   POSIX threads Threading Model                                                      
HHC01417I Using   Error-Checking Mutex Locking Model                                                 
HHC01417I With    Shared Devices support                                                             
HHC01417I With    Dynamic loading support                                                            
HHC01417I With    External GUI support                                                               
HHC01417I With    IPV6 support                                                                       
HHC01417I With    HTTP Server support                                                                
HHC01417I With    sqrtl support                                                                      
HHC01417I With    SIGABEND handler                                                                   
HHC01417I With    CCKD BZIP2 support                                                                 
HHC01417I With    HET BZIP2 support                                                                  
HHC01417I With    ZLIB support                                                                       
HHC01417I With    Regular Expressions support                                                        
**HHC01417I Without Object REXX support                                                                
HHC01417I Without Regina REXX support**                                                                
HHC01417I With    Automatic Operator support                                                         
HHC01417I Without National Language Support                                                          
HHC01417I With    CCKD64 Support                                                                     
HHC01417I Machine dependent assists: cmpxchg1 cmpxchg4 cmpxchg8 hatomics=C11                         
HHC01417I Running on: hercules (Linux-4.12.14-lp150.12.48-default x86_64) MP=8                       
HHC01417I Built with crypto external package version 1.0.0.27-ga3e07b5                               
HHC01417I Built with decNumber external package version 3.68.0.80-gdb5c456                           
HHC01417I Built with SoftFloat external package version 3.5.0.83-g3da230f                            
HHC01417I Built with telnet external package version 1.0.0.42-gcaec0ac                               
HHC00018W Hercules is NOT running in elevated mode                                                   
HHC00007I Previous message from function 'impl' at impl.c(906)                                       
HHC00150I Crypto module loaded (C) Copyright 2003-2016 by Bernard van der Helm                       
HHC00151I Activated facility: Message Security Assist                                                
HHC00151I Activated facility: Message Security Assist Extension 1, 2, 3 and 4                        
HHC00111I Thread CPU Time IS available (_POSIX_THREAD_CPUTIME=0)                                     
HHC00100I Thread id 00007fd5de5b6700, prio -1, name 'Processor CP00' started                         
HHC00811I Processor CP00: architecture mode z/Arch                                                   
HHC00100I Thread id 00007fd5de4b5700, prio -1, name 'timer_thread' started                           
HHC02204I CPUSERIAL      set to 1BA2EF                                                               
HHC02204I CPUMODEL       set to 2827                                                                 
HHC02204I MODEL          set to hardware(H20) capacity(H20) perm() temp()                            
HHC02204I PLANT          set to 01                                                                   
HHC17003I MAIN     storage is 8G (mainsize); storage is not locked                                   
HHC00111I Thread CPU Time IS available (_POSIX_THREAD_CPUTIME=0)                                     
HHC00100I Thread id 00007fd5de7b9700, prio -1, name 'Processor CP01' started                         
HHC00811I Processor CP01: architecture mode z/Arch                                                   
HHC00111I Thread CPU Time IS available (_POSIX_THREAD_CPUTIME=0)                                     
HHC00100I Thread id 00007fd5de6b8700, prio -1, name 'Processor CP02' started                         
HHC00811I Processor CP02: architecture mode z/Arch                                                   
HHC00111I Thread CPU Time IS available (_POSIX_THREAD_CPUTIME=0)                                     
HHC00100I Thread id 00007fd5de3b4700, prio -1, name 'Processor CP03' started                         
HHC00811I Processor CP03: architecture mode z/Arch                                                   
HHC02204I NUMCPU         set to 4                                                                    
HHC02204I MANUFACTURER   set to IBM                                                                  
HHC02204I ARCHLVL        set to z/Arch                                                               
HHC02204I ECPSVM         set to disabled                                                             
HHC02204I LOADPARM       set to                                                          
HHC02204I LOADPARM       set to                                                                         +
HHC02204I LPARNAME       set to SYSZ01                                                                   
HHC02204I LPARNUM        set to 1                                                                        
HHC02204I CPUIDFMT       set to 0                                                                        
HHC02204I PANTITLE       set to z/VM 6.3 PHOENIX SYSRES 6300                                             
HHC02204I SCPIMPLY       set to ON                                                                       
HHC01474I Using internal codepage conversion table default                                               
HHC02204I DIAG8CMD       set to ENABLE  NOECHO                                                           
HHC02204I PORT           set to port=8081 auth userid<herc> password<do1t>                               
HHC01807I HTTP server signaled to start                                                                  
HHC02204I PANRATE        set to SLOW                                                                     
HHC00100I Thread id 00007fd5de2b3700, prio -1, name 'http_server' started                                
HHC01802I HTTP server using root directory /local/sys1/z390/herc15007/share/hercules/                    
HHC01803I HTTP server waiting for requests on port 8081                                                  
HHC00100I Thread id 00007fd5ddfa0700, prio -1, name 'console_connect' started                            
HHC01024I Waiting for console connections on port 3270                                                   
HHC01250E 0:000C Card: error in function access(): No such file or directory                             
HHC00007I Previous message from function 'cardrdr_init_handler' at cardrdr.c(322)                        
HHC01463E 0:000C device initialization failed                                                            
HHC00007I Previous message from function 'attach_device' at config.c(1301)                               
HHC00901I 0:0F02 LCS: Interface tap0, type TAP opened                                                    
HHC00921I CTC: lcs device port 00: manual Multicast assist enabled                                       
HHC00935I CTC: lcs device port 00: manual Checksum Offload enabled                                       
HHC00224I 0:0760 Tape file *, type aws: display "        "                                               
HHC00224I 0:0761 Tape file *, type aws: display "        "                                               
HHC00224I 0:0D00 Tape file *, type aws: display "        "                                               
HHC00224I 0:0D01 Tape file *, type aws: display "        "                                               
HHC00224I 0:0D80 Tape file *, type aws: display "        "                                               
HHC00224I 0:0D81 Tape file *, type aws: display "        "

  The good news is that without Regina REXX installed, the Segment Fault does not happen anymore!

However, it is still a problem whenever Regina Rexx is installed. I don't understand what Rexx has to do with LCS / CTC support. I have never used Rexx together with Hercules, so only the libraries were linked in.

Fish-Git commented 5 years ago

Setting HREXX_PACKAGE=none does not help at all. The crash still occurs:

Dang! :(

Since the ./configure '--disable-regina-rexx' option is not working ...

Wow. I wasn't aware of that! That's definitely a bug. I'll try to get that fixed for you right away!

... I had to uninstall Regina REXX.

The good news is that without Regina REXX installed, the Segment Fault does not happen anymore!

Which is proof that Regina Rexx is definitely the cause of the problem! WHY, I haven't a clue.

However, it is still a problem whenever Regina Rexx is installed.

Understood. I can't remember having any problems myself when I was testing Hercules Regina Rexx support on my CentOS 6.10 VMware virtual machine, but I might not have tried it with an LCS device defined. I'll have to try it again.

I don't understand what Rexx has to do with LCS / CTC support.

I don't understand it either! It's very weird! It doesn't make any sense!

But during all of my Hercules REXX testing I had nothing but problems with Regina Rexx. It's a POS in my opinion. It's very buggy and very poorly documented. OORexx (Open Object Rexx), on the other hand, is much more stable and much better documented. It's just a much better product in my opinion, and is what I have installed on my Windows host system as well as on both my CentOS and Macintosh virtual machines.

I am seriously considering dropping support for Regina Rexx altogether at this point!  >8-<

IN SUMMARY: hang loose for a day or two(?) while I try to fix the '--disable-regina-rexx' configure bug. I'll let you know when the fix is committed so you can then build your Hercules without Regina rexx support from now on.

p.s. Have you tried installing ooRexx yet?

mcisho commented 5 years ago

I have Regina Rexx (and ooRexx) installed, and I have not experienced the Segment Fault. Also, ./configure --disable-regina-rexx (and --disable-object-rexx) work fine for me.

Fish-Git commented 5 years ago

I have Regina Rexx (and ooRexx) installed, and I have not experienced the Segment Fault. Also, ./configure --disable-regina-rexx (and --disable-object-rexx) work fine for me.

Thanks for that report, Ian! As I mentioned in my previous comment I too had both installed (as well as only one or the other too) during my testing and don't recall experiencing any crashes. But then I can't remember whether any of the tests I did were done with an LCS device in my configuration either, so I guess that doesn't mean much.

The problem may well be limited to openSuse 15 however. I don't know. I haven't heard back from Bob yet who I believe is also running openSuse 15.

What distro are you using, Ian?

mcisho commented 5 years ago

I'm using Fedora 29.

I have the opposite view of the Rexxes. I much prefer Regina, it's closer to Cowlishaw's vision, and isn't full of weird extensions. And I find the Regina manual easy to use and follow. Admittedly, I've never tried using Regina (or ooRexx) with Hercules, haven't had any need for either.

In one of the traces wasn't there a signal between a close and Regina becoming involved? LCS does do a SIGUSR2, perhaps it was being intercepted by Regina's signal handlers?
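
To make the "intercepted" idea concrete: signal dispositions are process-wide, so whichever code installs a handler last for a given signal is the one that gets called. The sketch below demonstrates only that general POSIX behavior. It is an assumption, not a confirmed fact, that Regina installs a SIGUSR2 handler, and `app_handler` / `lib_handler` are stand-in names for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

typedef void (*handler_fn)(int);

volatile sig_atomic_t app_calls = 0;   /* calls seen by the "application" handler */
volatile sig_atomic_t lib_calls = 0;   /* calls seen by the "library" handler */

void app_handler(int signo) { (void)signo; app_calls++; }
void lib_handler(int signo) { (void)signo; lib_calls++; }

/* Install a handler with sigaction and return the one it replaced
   (or SIG_ERR on failure). */
handler_fn install_handler(int signo, handler_fn handler)
{
    struct sigaction sa, old;
    assert(handler != NULL);
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(signo, &sa, &old) != 0)
        return SIG_ERR;
    return old.sa_handler;
}
```

If the application calls `install_handler(SIGUSR2, app_handler)` and a library loaded afterwards calls `install_handler(SIGUSR2, lib_handler)`, the second call returns `app_handler` as the replaced handler, and a subsequent `raise(SIGUSR2)` runs only `lib_handler`. Checking what `sigaction` reports as the current handler just before the close would be one way to test the theory.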

Fish-Git commented 5 years ago

EDIT: This comment is BOGUS!     (Doh!)

Version 0f1b54b5... is David Durand's pull request that I merged which I hadn't pulled into my local repository yet! (Doh!)

Version 0f1b54b5... is the most current version!

My bad! Sorry!   :(


@fbi-ranger

FLORIAN!

IMPORTANT!

The version of SDL Hyperion 4.2 that you are using IS BOGUS!!

According to your Hercules logfile that you posted:

HHC01413I Hercules version 4.2.0.0-SDL-g0f1b54b5-modified (4.2.0.0)

you are using a BOGUS/UNKNOWN version!!

The git hash "0f1b54b5..." DOES NOT EXIST anywhere in the official SDL Hyperion 4.2 repository's commit history! I have no idea where you got your version of Hercules from, but it is bad! (bogus!)

Please delete your SDL Hercules installation and clone the official SDL Hyperion version 4.2 from GitHub:

and then rebuild and try your test again!

Fish-Git commented 5 years ago

I much prefer Regina, it's closer to Cowlishaw's vision, and isn't full of weird extensions. And I find the Regina manual easy to use and follow.

To each their own. :)

In one of the traces wasn't there a signal between a close and Regina becoming involved?

Yes, I saw that too.

LCS does do a SIGUSR2, perhaps it was being intercepted by Regina's signal handlers?

It wouldn't surprise me in the least!

However... the SIGUSR2 signal should be consumed (ignored?) by our signal handling function sigabend_handler, so it shouldn't be passed on to Regina, yes?

(I'm not very experienced with, nor knowledgeable about, Unix signal handling!)

The sigabend_handler function just returns whenever SIGUSR2 is received, which ignores (consumes?) the signal, yes? In order to "pass it on" you need to do:

        signal( signo, SIG_DFL );
        raise( signo );

which sigabend_handler is not doing for SIGUSR2. So if I'm understanding Unix signal handling correctly, Regina should not even be receiving the signal at all! Yes? Why the gdb backtrace shows it I don't know. I'm not a Linux person.
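
For reference, the difference between a handler that simply returns and one that resets to SIG_DFL and re-raises can be observed with a small standalone test. This is only a sketch of the general POSIX behavior, not Hercules code: `consume_handler`, `pass_on_handler` and `run_with_handler` are hypothetical names, and the child process exists only so the parent can inspect how it ended.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Handler that just returns: the signal is consumed and goes no further */
void consume_handler(int signo)
{
    (void)signo;
}

/* Handler that passes the signal on to the default action */
void pass_on_handler(int signo)
{
    signal(signo, SIG_DFL);
    raise(signo);
}

/* Fork a child that installs the given SIGUSR2 handler and raises SIGUSR2.
   Returns the child's wait status, or -1 on error. */
int run_with_handler(void (*handler)(int))
{
    assert(handler != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = handler;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGUSR2, &sa, NULL);
        raise(SIGUSR2);
        _exit(0);             /* reached only if the signal was consumed */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

With `consume_handler` the child exits normally with status 0. With `pass_on_handler` the child is terminated by SIGUSR2, because the default action for that signal is to end the process. Neither case hands the signal to some other handler, which fits the reasoning above that a handler which merely returns does not forward anything.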

rgschmi commented 5 years ago

I've just installed rexx (Regina because I've used it in Windows), and will git and build the latest Hercules. I've not used rexx with Hercules yet, though it's on my to-do list. Is there a simple rexx test I can run for you?

rgschmi commented 5 years ago

I've installed Hercules version 4.2.0.0-SDL-g0f1b54b5-modified (4.2.0.0) and am getting seg faults every time I exit Hercules, with or without sockdev devices or without even starting a guest z/OS.

I do have Regina rexx enabled, but didn't exec any rexx scripts.

It fails even without LCS devices defined.

This was not happening with the previous version of Hercules. I was unable to get a segfault with anything I tried, with or without sockdev.

HHC01413I Hercules version 4.2.0.0-SDL-g0f1b54b5-modified (4.2.0.0)                                                                               
HHC01414I (C) Copyright 1999-2019 by Roger Bowler, Jan Jaeger, and others                                                                         
HHC01417I ** The SoftDevLabs version of Hercules **                                                                                               
HHC01415I Build date: Mar 31 2019 at 15:29:10                                                                                                     
HHC01417I Built with: GCC 7.3.1 20180323 [gcc-7-branch revision 258812]                                                                           
HHC01417I Build type: GNU/Linux x86_64 host architecture build                                                                                    
HHC01417I Modes: S/370 ESA/390 z/Arch                                                                                                             
HHC01417I Max CPU Engines: 64                                                                                                                     
HHC01417I Using   shared libraries                                                                                                                
HHC01417I Using   setresuid() for setting privileges                                                                                              
HHC01417I Using   POSIX threads Threading Model                                                                                                   
HHC01417I Using   Error-Checking Mutex Locking Model                                                                                              
HHC01417I With    Shared Devices support                                                                                                          
HHC01417I With    Dynamic loading support                                                                                                         
HHC01417I With    External GUI support                                                                                                            
HHC01417I With    IPV6 support                                                                                                                    
HHC01417I With    HTTP Server support                                                                                                             
HHC01417I With    sqrtl support                                                                                                                   
HHC01417I With    SIGABEND handler                                                                                                                
HHC01417I Without CCKD BZIP2 support                                                                                                              
HHC01417I Without HET BZIP2 support                                                                                                               
HHC01417I With    ZLIB support                                                                                                                    
HHC01417I With    Regular Expressions support                                                                                                     
HHC01417I Without Object REXX support                                                                                                             
HHC01417I With    Regina REXX support                                                                                                             
HHC01417I With    Automatic Operator support                                                                                                      
HHC01417I Without National Language Support                                                                                                       
HHC01417I With    CCKD64 Support                                                                                                                  
HHC01417I Machine dependent assists: cmpxchg1 cmpxchg4 cmpxchg8 hatomics=C11                                                                      
HHC01417I Running on: Suse1 (Linux-4.12.14-lp150.12.45-default x86_64) MP=2                                                                       
HHC01417I Built with crypto external package version 1.0.0.27-ga3e07b5                                                                            
HHC01417I Built with decNumber external package version 3.68.0.80-gdb5c456                                                                        
HHC01417I Built with SoftFloat external package version 3.5.0.83-g3da230f                                                                         
HHC01417I Built with telnet external package version 1.0.0.42-gcaec0ac                                                                            
HHC01603I exit                                                                                                                                    
Segmentation fault (core dumped)
root@Suse1:/home/rgschmi>

Fish-Git commented 5 years ago

@fbi-ranger

Florian, please IGNORE my previous comment. The version you are using (0f1b54b5...) is indeed the correct version. I apologize for the confusion!   :(

Fish-Git commented 5 years ago

But during all of my Hercules REXX testing I had nothing but problems with Regina Rexx. It's a POS in my opinion. It's very buggy and very poorly documented. OORexx (Open Object Rexx) on the other hand, is much more stable and much better documented as well.

I have the opposite view of the Rexxes. I much prefer Regina, it's closer to Cowlishaw's vision, and isn't full of weird extensions. And I find the Regina manual easy to use and follow.

FYI: ooRexx is IBM's Rexx, whereas Regina Rexx is not:

And:

[fish@centos-64 ~]$ rexx -v

Open Object Rexx Version 4.2.0
Build date: Dec 31 2013
Addressing Mode: 64

Copyright (c) IBM Corporation 1995, 2004.
Copyright (c) RexxLA 2005-2013.
All Rights Reserved.
This program and the accompanying materials are made available under
the terms of the Common Public License v1.0 which accompanies this
distribution or at
http://www.oorexx.org/license.html

[fish@centos-64 ~]$ 

Regina Rexx is maintained by Mark Hessling, not IBM.

I personally have had nothing but problems with Regina, whereas I have hardly had any problems at all with ooRexx. It just works.   (whereas Regina Rexx frequently doesn't!)

Fish-Git commented 5 years ago

@rgschmi @fbi-ranger

Bob: Florian:

How did you "install" Regina?

I seem to recall that you need to install the Regina-REXX-lib rpm first, then the Regina-REXX-devel rpm, and finally the Regina-REXX rpm last. (And then build Hercules.)

Also, does Hercules behave any differently when you start rexx manually before attempting to start Hercules for the first time? That is to say, after logging on to your Linux session (userid), enter the command rexx -v (or rexx --version?) and then, afterwards, try starting Hercules for the first time.

I seem to recall Regina (as well as ooRexx too?) comes with a daemon that must be running before rexx will behave properly, and the daemon is not started until you run rexx for the first time.

I have the command rexx -v in my bash profile so the daemon is automatically started every time I logon.

fbi-ranger commented 5 years ago

Under openSUSE you install normally via YAST, which means, you do not decide which rpm is chosen first as the sequence depends on the rpm install list (dependencies).

As far as I understand, the daemon is used to get access to rexx queues and, AFAIK, is automatically integrated into the startup procedures. I don't know if it has any function other than the rxqueue(s).

It is registered in the service manager and normally starts automatically (without any rexx invocation)

 ps -ef | grep rx 
root     26289     1  0 18:26 ?        00:00:00 /usr/bin/rxstack -d

# rexx -v
rexx: REXX-Regina_3.9.1 5.00 5 Apr 2015 (64 bit)

The reason I use Regina REXX is that it has a convenient way of putting outputs from system commands to a queue, which ooRexx does not have or at least didn't have when I was trying to play with it many years ago.

I agree with Ian that Regina REXX is more close to the original REXX. As I am coming from the mainframe, I wanted as much as possible the same functionality as under VM/CMS and not having weird constructions via temporary files etc.

Maybe this has changed in ooREXX in the meantime, I don't know. Under Linux I now use Perl instead of REXX, so I could easily give REXX up if that solves the LCS problem. However, this is surely not an acceptable solution.

Regarding invocation of REXX libraries: Couldn't it be the problem that all libraries are statically linked during link phase of Hercules install process? Maybe that is why even I do not use REXX together with Hercules they are loaded and active?

Fish-Git commented 5 years ago

Under openSUSE you install normally via YAST, which means, you do not decide which rpm is chosen first as the sequence depends on the rpm install list (dependencies).

As you know, I do not know a lot about Linux. I am relatively inexperienced. But as far as I know, a default install of rexx only installs the components necessary to use rexx, i.e. it only installs the components to be able to run (execute) rexx scripts.

But as far as I know a default install does not install the components needed to do rexx development, i.e. it does not install the components needed to be able to write programs that call directly into rexx internal functions, etc, i.e. it does not install the components needed to link your program with rexx itself, so your program can execute rexx scripts by directly calling into internal rexx functions like the way Hercules does. (Hercules does not fork a separate process to run rexx scripts. It calls internal rexx functions to ask it to "please execute this script" and passes statements to it, etc.) To do rexx development like the way Hercules needs to do, you have to install the "lib" and "devel" packages. Hercules does not call rexx externally. Instead, rexx support is integrated directly into Hercules. Thus the need for the "lib" and "devel" packages.

The reason I use Regina REXX is that it has a convenient way of putting outputs from system commands to a queue, which ooRexx does not have or at least didn't have when I was trying to play with it many years ago.

Is placing the output from system commands into a queue something that the rexx language supports? Is doing that part of the language? Is doing that part of standard rexx? If the answer is yes, then I'm sure ooRexx supports it! And I'm sure it supports it in the standard language-defined way too!

The rexx language is well defined, defining exactly how each command, statement, function is supposed to behave. If placing the output of system commands into a queue is something that the rexx language defines (if doing that is something supported by the rexx language), then any product claiming to be a rexx interpreter must obviously support it (and support it in the defined manner) or else that product cannot be called a valid rexx language interpreter!

Given that ooRexx is written and copyrighted by IBM themselves, I personally would trust ooRexx more than I would Regina! (which is not an official IBM product and thus far less likely to conform to the original Rexx language which was invented by IBM).

I would personally trust IBM themselves to write a rexx interpreter that conforms to the original Rexx language of Cowlishaw (who is an IBM Fellow and who invented Rexx while working at IBM!) more than I would someone like Anders Christensen or Mark Hessling, who as far as I know did not work at IBM.

I agree with Ian that Regina REXX is more close to the original REXX.

How can a product not written by IBM be "closer to the original REXX" than a product that was written by IBM? (especially given that the "original REXX" was something that was written by IBM!)

How can a non-IBM product conform to an IBM product better than IBM's own product?! That doesn't make any sense!

As I am coming from the mainframe, I wanted as much as possible the same functionality as under VM/CMS and not having weird constructions via temporary files etc.

I am not familiar with these "weird constructions via temporary files" that you mention. What are you talking about?

And as far as I know, Regina and ooRexx both "provide the same functionality as under VM/CMS".

Is there something that ooRexx does that is not the same as under VM/CMS?? I doubt it! I am very confident that ooRexx -- an IBM product! -- provides the same functionality as under VM/CMS!

Regarding invocation of REXX libraries: Couldn't it be the problem that all libraries are statically linked during link phase of Hercules install process?

Rexx support is linked into Hercules by default, yes, but only if the needed headers (and libraries?) are found.

Maybe that is why even I do not use REXX together with Hercules they are loaded and active?

If you have Rexx installed (Regina OR ooRexx (or both!)) then Hercules will build itself with Rexx support.

Perhaps our default should be to NOT provide Rexx support by default? I.e. perhaps Hercules rexx support should only be provided by specific request? (e.g. via a --ENABLE-regina-rexx configure option?) That is perhaps something we could discuss elsewhere.

For the purpose of this issue however (which is trying to fix your crash when LCS devices are used when Regina Rexx is installed), have you tried uninstalling Regina, and then manually installing ALL of the Regina packages, including the "lib" and "devel" packages too? (which I believe must be installed first, then the normal/default Regina package last).

Maybe if you do that it won't crash any more? Maybe? I don't know. It's just wishful thinking.

Personally I believe Regina has either a race condition or, more likely, is erroneously (incorrectly) receiving and processing Hercules's SIGUSR2 signal, and that is what is causing it to crash: a poorly (improperly) written non-IBM rexx interpreter.

Fish-Git commented 5 years ago

@rgschmi wrote:

I've installed Hercules version 4.2.0.0-SDL-g0f1b54b5-modified (4.2.0.0) and am getting seg faults every time I exit Hercules, with or without sockdev devices or without even starting a guest z/OS.

I do have Regina rexx enabled, but didn't exec any rexx scripts.

It fails even without LCS devices defined.

This was not happening with the previous version of Hercules. I was unable to get a segfault with anything I tried, with or without sockdev.

  Bob:

Have you tried:

  1. Uninstalling Regina and then rebuilding Hercules? Based on Florian's test, uninstalling Regina prevents Hercules from crashing.

(and/or)

  2. Build Hercules using the ./configure option: --disable-regina-rexx to prevent Hercules from trying to use Regina/rexx?

Either of those techniques should prevent Hercules from crashing if Regina is indeed the culprit (which it appears it is).

Fish-Git commented 5 years ago

I do have Regina rexx enabled, but didn't exec any rexx scripts.

Does not matter. If Regina is installed, Hercules, by default, will be built with rexx support. You do not have to actually use Hercules's rexx support, but it will be there (which is apparently the problem) (at least with Regina anyway; ooRexx AFAIK doesn't have this problem).
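
As a rough illustration of why the support is "there" even when no script is ever run, here is a minimal sketch of build-time feature gating. The macro `HAVE_REGINA_REXX` and the function `regina_support_banner` are hypothetical names chosen for this example; the actual macros used by the Hercules build are not shown in this thread.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical: a configure script would define this (for example via
   -DHAVE_REGINA_REXX=1) when the Regina headers are found at build time. */
#ifndef HAVE_REGINA_REXX
#define HAVE_REGINA_REXX 0
#endif

/* The choice is fixed when the program is compiled, not when it runs, so
   the support is present whether or not any rexx script is ever executed. */
const char *regina_support_banner(void)
{
#if HAVE_REGINA_REXX
    return "With    Regina REXX support";
#else
    return "Without Regina REXX support";
#endif
}
```

Under this assumption, passing something like --disable-regina-rexx to configure would amount to leaving the macro unset, so the "Without" branch is the one compiled in.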

Fish-Git commented 5 years ago

FYI to those who still prefer Regina over IBM's own ooRexx product (in case you missed it):

(Excerpt):

... it's definitely Hercules's logger_init redirection logic that is confusing poor Regina. OORexx works flawlessly, with or without redirection, but Regina unfortunately doesn't.

Regina sucks!

ooRexx rocks!

(IMHO)

Fish-Git commented 5 years ago

More evidence of Regina's bugginess/inferiority:

https://github.com/SDL-Hercules-390/hyperion/commit/46b278269351813a5f49a7d3e03ea63fc413a4b9#diff-b1d4bab4078e65d7bddbcea1a8b9980b