Closed cfdonatucci closed 2 years ago
A couple of things right off the bat:
I would prefer seeing a complete Hercules configuration file rather than just a single device statement. There could be other valuable information in there that might help explain things.
I would prefer to see a complete Hercules log file (from beginning to end) rather than the small extract you provided. There could be valuable information in the log that might help explain what's going on (or what happened).
You said "I also ran TTTest64.exe successfully", but simply starting TTTest64 and then not doing anything is not enough. After starting TTTest64 you need to actually perform a test! More specifically, a "Multi-directional Ping Test", as explained in the CTCI-WIN documentation (Help file). You also failed to show us your TTTest64 test output. It contains valuable information about your host networking setup.
On your Hercules device statement, you specified netmask 255.255.0.0, but in your z/OS TCP config you specified 255.255.255.0.
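To illustrate why the two netmasks matter, here is a minimal sketch using Python's standard `ipaddress` module. The guest address used is only an illustrative assumption; the two masks are the ones quoted above. It shows that the same address lands in two different networks depending on which mask is applied, so the Hercules side and the z/OS side would disagree about which hosts are "local".

```python
import ipaddress

def same_network(ip: str, mask_a: str, mask_b: str) -> bool:
    """Return True if both netmasks place `ip` in the same network."""
    net_a = ipaddress.ip_network(f"{ip}/{mask_a}", strict=False)
    net_b = ipaddress.ip_network(f"{ip}/{mask_b}", strict=False)
    return net_a == net_b

# Illustrative guest address (an assumption, not taken from the config file).
guest_ip = "192.168.1.115"

print(ipaddress.ip_network(f"{guest_ip}/255.255.0.0", strict=False))    # 192.168.0.0/16
print(ipaddress.ip_network(f"{guest_ip}/255.255.255.0", strict=False))  # 192.168.1.0/24
print(same_network(guest_ip, "255.255.0.0", "255.255.255.0"))           # False
```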
Most of these things are mentioned in our "SUBMITTING PROBLEM REPORTS" document. Please review it and then submit all needed additional information. Thanks.
In the meantime I'll take a peek at your dump to see whether anything jumps out at me.
Dump was worthless.
I need to see the complete Hercules log file, configuration file, and TTTest64 output.
Explaining in more detail what you were doing (or trying to do) when the problem occurred might help too.
p.s. Did you disable Unconstrained Transactions (i.e. run TXOFF) before doing whatever it was you were attempting to do, like you did previously?
Hi, thank you for your answer. The netmask was fixed and everything redone. It failed again.
I've attached the entire log, second dump, TTTest64 report and entire config file:
I also ran TXOFF before doing whatever I do. The only thing not clear to me is when TXON should be executed.
I'm testing with OSA because it failed with CTCI as well. In general Hercules works pretty fine. The issue only occurs when I want to access PC applications to exploit mainframe facilities.
Activities:
zosexplorer
to zCEE server can be established. I can see services and APIs. I'm using all products included in Hercules, but for example I've seen that WinPCap is not supported and Npcap is recommended. Don't know if that may be related.
Let me know anything else you need.
Regards Carlos
I'm going to need some help reproducing this. I have almost zero z/OS skills.
I do have z/OS 2.4B, and it seems to IPL and run just fine, but for all of my z/OS Hercules IPL tests, I always use loadparm WSM (WS = CLPA and Warm start of JES2. Base z/OS system functions i.e. no CICS, DB2, IMS, etc. M=Verbose IPL Messages), not CI.
When I just now tried IPLing my system (which I believe is an ADCD system) using loadparm CIM (CI = CLPA and Warm start of JES2. Loads CICS 5.3 and 5.2 libraries. Starts CICS 5.3, z/OSMF, and IBM Developer for z Systems. M=Verbose IPL Messages), it's asking me:
IGGN505A SPECIFY UNIT FOR DFH540.CICS.SDFHLPA ON B4C541 OR CANCEL
which I don't know how to respond to. :(
I also don't know what cicsexpl is, nor what a zosexpl session with RSED is. :(
My system is configured to use OSA and is configured almost identically as yours.
Can you explain to me (in simple terms please! I'm not a z/OS person!) what I need to do to reproduce your problem?
Thanks.
p.s. I suspect something might be wrong with your local network configuration. Your TTTest64 report does not look right. Your first ping of www.linux.org using a tun interface reported a ping response time of "time=16ms", which is quite reasonable. But then you closed your tun interface and tried the same thing again using a tap interface instead, and this time all of your ping responses were "time=<1ms"!! I am seriously doubting you can ping www.linux.org from your Windows system in less than 1ms!!
Can you provide some more details regarding your local networking? Thanks.
> Can you provide some more details regarding your local networking? Thanks.
And it might be a good idea to try your TTTest64 Ping Test again, but this time without using a tun interface beforehand. Do the test right away using only a tap defined interface.
It might not hurt to clear your ARP cache beforehand too, or even IPL your Windows system.
You said you had to reinstall Windows 10, and as I recall, Windows 10 did have some type of bug in its networking handling at some point in the distant past that was fixed with an update. Did you (re-)install all of your Windows/Security updates after you reinstalled Windows? (and before doing your Hercules test?)
Hi, Regarding the message, it is just because the CICS 54 DFHLPA may be in the lpalist, but the lib doesn't exist. In such cases you can reply 0,cancel. All console messages at that stage must be replied to with 0,something. Sorry if this is obvious.
zosexplorer is an IBM eclipse application working as a framework for other applications included as plugins, like cics explorer, git interfaces, zos connect enterprise edition, zosexplorer, dbb for zDevOps and others. I start my ADCD z/OS 2.4 with CI as well.
To reproduce my problem you could use zosexplorer. You need two started tasks: jmon and rsed, which I believe are started by default and define a connection to the default eclipse zos explorer port, port 4035.
With my previous Windows I used to have a fixed IP address 192.168.1.110 with a CTCI definition. Unfortunately I didn't keep record of that configuration. So I tried to do the same and now I have this problem. This worked in my other Windows. So I switched to an OSA definition, hoping it would help, but it didn't.
My local networking has DHCP, no other configuration. I erased the ARP cache and re-ran TTTest64 Ping Test:
> Regarding the message, it is just because the CICS 54 DFHLPA may be in the lpalist, but the lib doesn't exist. In such cases you can reply 0,cancel.
Thanks. That seems to have worked. I had to reply to it twice though. After the first reply, the same(?) message appeared again a minute or so later. After I replied to the second message, the system finally finished IPLing and is now running normally(?).
> To reproduce my problem you could use zosexplorer.
Which I know nothing about. :(
> You need two started tasks: jmon and rsed...
Which I don't know how to do. :(
> ...which I believe are started by default...
Good!! :)
> ...and define a connection to the default eclipse zos explorer port, port 4035.
How can I tell whether zosexplorer (i.e. jmon and rsed??) is finished and ready for work? After IPLing my system, all processors are running at nearly 100%. How do I know when everything is ready for me to begin my testing? How long (approximately) do I have to wait before trying my test?
What do I do with port 4035? Do I connect a terminal to it? What type of terminal? 3270? Or simple command line?
Once connected(?), THEN what do I do? Do I simply enter some type of command? What command do I enter? What do I need to try doing to reproduce the crash?
Sorry for the stupid questions, but as I explained, I know almost nothing about z/OS! :(
Thanks.
> How can I tell whether zosexplorer (i.e. jmon and rsed??) is finished and ready for work?
FYI: When I press PF10 on my master console (to issue the D A,L command), I do see both JMON and RSED in the list of active tasks. Does that mean they're both ready? Does that mean I can do my test? (whatever that test is! I don't even know what I'm supposed to be doing!). Thanks.
one question, do you have a TSO session or just the console?
> one question, do you have a TSO session or just the console?
TSO session too. I should tell you I DO know how to do a little teensy tiny bit. I know how to logon. I know how to use ISPF 6 to issue 'ping' and other commands. I know how to browse dataset/edit dataset members, submit jobs (although I do NOT know JCL!), look at printouts, etc. And I know how to cleanly shut the system down. But nothing beyond that.
OK, I can guide you through downloading z/OS Explorer from the IBM site, configuring a connection to the mainframe, and invoking it, if you are willing to do it. I'll have to take a series of screenshots and send them to you. Please confirm if you want that. Additionally, do you have TXOFF/TXON in your z/OS?
> ...if you are willing to do it.
Yes, certainly!
> I'll have to take a series of screenshots and send them to you.
That's fine. My email address is fish at either softdevlabs.com or infidels.org.
> additionally, do you have TXOFF/ON in your zos?
Not yet, no. But I do have a copy of Jürgen's TXONOFF job stream, so as far as I know all I have to do is run it, and then it'll be on my system. Yes? And then all I need to do is enter the command 'txoff' or 'txon' from ISPF 6 to disable or enable unconstrained transactions, yes?
yes to both... i'm writing the instructions. I'll send them as soon as i can. Thank you very much.
CORRECTION: I just found my notes regarding TXONOFF:
--------------------------------------------------------------------------------
Disable/Enable UNconstrained transactions
Upload Jürgen's TXONOFF to IBMUSER adcd.lib.jcl and run it.
(Be careful during upload! The job stream is CASE SENSITIVE!)
Then to activate, either re-IPL, or issue console command "F LLA,REFRESH"
Then to switch unconstrained transactions OFF/ON
(FOR ALL SUBSEQUENTLY STARTED jobs!), simply do:
"S TXOFF" at the console (or "TXOFF" from a shell prompt)
"S TXON" at the console (or "TXON" from a shell prompt)
Example:
alias cls=clear
export PATH=$PATH:$JAVA_HOME/bin
TXOFF
time java com.ibm.jvm.format.TraceFormat test.trc
time java com.ibm.jvm.format.TraceFormat test.trc > test.trc.fmt 2>&1
--------------------------------------------------------------------------------
Is that correct?
> i'm writing the instructions. I'll send them as soon as i can.
Thanks! Standing by...
FYI: TXONOFF is now on my system, and entering s txoff from the master console resulted in:
- 17.24.51 s txoff
- 17.24.51 IRR812I PROFILE * (G) IN THE STARTED CLASS WAS USED
- TO START TXOFF WITH JOBNAME TXOFF.
- 17.24.51 STC00442 $HASP373 TXOFF STARTED
- 17.24.52 STC00442 IEF404I TXOFF - ENDED - TIME=17.24.52
So I think we're good to go.
Standing by for further instructions via email...
mail sent, let me know if you got it.
Nope. Not yet. :(
What email address did you send it to?
Hi, I'll attach the file here. Regards.
Hi I did two things:
12:07:23.003 0000302C HHC03991D 0:0401 QETH: RRH_TYPE_ULP: PUK_TYPE_DISABLE (ULP_DISABLE): Request
12:07:23.003 0000302C HHC03981D 0:0401 QETH: TH : +0000< 00E00000 0000001B 00000014 00000055 ...............U .\..............
12:07:23.003 0000302C HHC03981D 0:0401 QETH: TH : +0010< 10000001 .... ....
12:07:23.003 0000302C HHC03981D 0:0401 QETH: RRH: +0000< 00000000 417E0001 00000004 00000003 ....A~.......... .....=..........
12:07:23.003 0000302C HHC03981D 0:0401 QETH: RRH: +0010< 00240015 00001505 D8C5E3F3 00000000 .$.............. ........QET3....
12:07:23.003 0000302C HHC03981D 0:0401 QETH: RRH: +0020< 00000000 .... ....
12:07:23.003 0000302C HHC03981D 0:0401 QETH: PH : +0000< 01000015 00000040 .......@ .......
12:07:23.003 0000302C HHC03981D 0:0401 QETH: PUK: +0000< 000C4103 00090000 00000000 ..A......... ............
12:07:23.003 0000302C HHC03981D 0:0401 QETH: PUS: +0000< 00090403 05000101 16 ......... .........
12:07:23.003 0000302C HHC03997I 0:0401 QETH: tun0: not using MAC address 02:00:5e:a3:be:84
12:07:23.003 0000302C HHC03997I 0:0401 QETH: tun0: not using IP address 192.168.1.115
12:07:23.003 0000302C HHC03997I 0:0401 QETH: tun0: not using subnet mask 255.255.255.0
12:07:23.003 0000302C HHC03997I 0:0401 QETH: tun0: not using MTU 1500
12:07:23.012 0000302C HHC03991D 0:0402 QETH: Halting data device
Entire log attached.
I hope this helps.
> What email address did you send it to?
fish@softdevlabs.com
Weird. I never received it.
> Hi, I'll attach the file here. Regards.
Thanks. I downloaded it and tried it yesterday after making the following changes:
I ran IBM Explorer for z/OS from another system on my local network: a Windows 7 x64 VMware virtual machine (but the fact that it was a virtual machine shouldn't make any difference. As far as my Windows 7 host was concerned, it was another system on the local network).
I added a new Inbound rule to my Windows Firewall to let just TCP port 4035 through. That didn't seem to work, so I changed it to Protocol = Any instead, which of course removed the port number restriction, effectively disabling my firewall entirely (i.e. letting anyone connect to anything from anywhere), and that did work:
As you can see below, I was able to connect and things worked just fine (although I didn't know what the heck I was doing! I'm not familiar with IBM Explorer for z/OS!):
I was however able to view files and printouts! Pretty cool!
One thing I did notice was that sometimes the connection would fail on the first attempt. But if I tried again it would work the second time around. I'm not sure what that means, if anything.
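For anyone who wants to check reachability of the RSED port independently of IBM Explorer for z/OS, the following is a minimal sketch of a TCP probe with retries. The function name `probe_port` and its defaults are hypothetical, and the probe only tests whether a TCP connection can be opened at all. It says nothing about what happens once RSED starts talking over that connection.

```python
import socket
import time

def probe_port(host: str, port: int, attempts: int = 3, timeout: float = 3.0) -> int:
    """Try to open a TCP connection to host:port.

    Returns the 1-based number of the attempt that succeeded, or 0 if every
    attempt failed (refused, timed out, filtered by a firewall, etc.).
    """
    for attempt in range(1, attempts + 1):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return attempt
        except OSError:
            time.sleep(0.5)  # brief pause before retrying
    return 0

# Example (not executed here): probe the default RSED port on the z/OS guest.
# The address is whatever your guest uses; 4035 is the port named in this thread.
#   print(probe_port("192.168.1.115", 4035))
```

A result of 2 or 3 would match the "fails on the first attempt, works on the second" behaviour described above, while 0 would point at a firewall rule or routing problem rather than at RSED itself.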
I also do not use DHCP. All of my systems are hard coded with their own uniquely assigned IP addresses, so it wasn't exactly a fair test. My initial goal wasn't to try and exactly reproduce your problem, but rather just to see if I could get it to work, and I succeeded in that endeavor.
I also have Checksum Offloading overridden in CTCI-WIN too:
Checksum offloading = OVERRIDDEN
(Refer to "Disable CTCI-WIN's default Checksum Offload behavior" in the "Common Problems" section of the CTCI-WIN Help file)
And finally, I do not have IPv6 enabled on my adapter either (whereas you do). Despite all the hype, I've personally never found much use for IPv6. If you have a lot of internet devices maybe you have a need, I don't know. But for me, living without IPv6 is not a problem.
I would also note that at no time during my initial attempts (when my connection attempts would fail and RSED would crash due to a TXF restricted instruction failure) did my system crash. Hercules remained up and running just fine.
If I get time I will MAYBE try to configure my system to use DHCP and also try running IBM Explorer for z/OS on the same system that Hercules is running on, just to see whether that makes any difference or not. I'm doubting it will, but it might be worth a shot.
Personally I think your local Windows network is borked. Your second TTTest64 report is still showing "time=<1ms" for your pings to www.linux.org, which is virtually impossible.
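As a rough aid for spotting this in a saved report, here is a small sketch that scans Windows-style ping output for replies below a chosen round-trip floor. The function name and the sample lines (which use a documentation-only address) are hypothetical; the regular expression accepts the `time=16ms`, `time<1ms` and `time=<1ms` spellings that appear in this thread.

```python
import re

# Matches "time=16ms", "time<1ms" and "time=<1ms".
_RTT = re.compile(r"time=?(<)?(\d+)ms")

def suspicious_replies(ping_output: str, floor_ms: int = 1) -> list[str]:
    """Return the reply lines whose round-trip time is below `floor_ms`."""
    flagged = []
    for line in ping_output.splitlines():
        match = _RTT.search(line)
        if not match:
            continue
        less_than, value = match.group(1), int(match.group(2))
        # "<N" means strictly below N ms; a plain value is the rounded time.
        below = (value <= floor_ms) if less_than else (value < floor_ms)
        if below:
            flagged.append(line.strip())
    return flagged

# Hypothetical sample, using a documentation-only address.
sample = """\
Reply from 203.0.113.10: bytes=32 time=16ms TTL=55
Reply from 203.0.113.10: bytes=32 time<1ms TTL=64
"""
print(suspicious_replies(sample))
```

A sub-millisecond reply is normal for a host on the same LAN segment, so the check is only meaningful when the target is a remote internet host.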
Some things to try/check:
Check your router to make sure 192.168.1.115 (i.e. your z/OS guest's IP address) is not within its DHCP range. If it is, try using a different IP address for your z/OS guest to see if that works any better.
Try defining your QETH (OSA) device in your Hercules configuration as a normal non-DHCP device:
0400.3 OSA chpid F0 iface 192.168.1.37 ipaddr 192.168.1.115 netmask 255.255.255.0
You may or may not want to disable IPv6 on your adapter.
Try uninstalling and reinstalling (and rebooting in between and afterwards?) CTCI-WIN. The fact that your pings to linux.org are still getting <1ms responses tells me there is still something very wrong with your networking setup. I would personally NOT PROCEED with any Hercules testing until you can first get CTCI-WIN to get valid responses to its pings to linux.org. (Does the same phenomenon occur when you try pinging someone else? Such as www.softdevlabs.com?)
That's all for now.
I'll continue trying to reproduce your crash but so far I haven't had any luck.
Hi it's great you made it work. I'll take a look at your info as soon as I have time as I'm in between jobs. I really appreciate your help. Take care.
Is it possible for you to send a couple of screenshots of your fixed IP definitions on Windows? tks
Here you go:
What happens when you try to "ping www.linux.org" from z/OS? (i.e. ISPF function 6) or any other IP address? Does "HOMETEST" complete successfully? Did you configure your z/OS "NSINTERADDR" DNS server values in member ADCD.Z24B.TCPPARMS(TCPDATA)?
I still think it's a problem with your Windows host's networking configuration. Since what started this whole mess was your having to reinstall your Windows 10 "after a problem" (what was the problem by the way??), Windows may have installed a default/generic device driver for your networking adapter during the install. Have you checked with your manufacturer (Realtek?) to see if there's a newer version?
Yes, I'm grasping at straws here! I admit it. But your TTTest64 ping test results keep setting off alarm bells for me!
I admit however, that no matter how screwed up your host networking is, it shouldn't be causing Hercules to crash! Hercules should, ideally, never crash.
I don't think I ever asked you: is this problem (Hercules crashing) reliably reproducible? Does it happen every single time?
Trying to launch IBM Explorer for z/OS from my Windows 7 x64 host system, I'm getting the following error dialog:
---------------------------
Zosexplorer
---------------------------
Java was started but returned exit code=13
-Dorg.eclipse.swt.accessibility.UseIA2=false
-Djava.util.Arrays.useLegacyMergeSort=true
-XX:MaxPermSize=256m
-Djava.class.path=C:\Users\Fish\Downloads\#489\IBM Explorer for zOS\\plugins/org.eclipse.equinox.launcher_1.5.0.v20180512-1130.jar
-os win32
-ws win32
-arch x86_64
-showsplash
-launcher C:\Users\Fish\Downloads\#489\IBM Explorer for zOS\zosexplorer.exe
-name Zosexplorer
--launcher.library C:\Users\Fish\Downloads\#489\IBM Explorer for zOS\\plugins/org.eclipse.equinox.launcher.win32.win32.x86_64_1.1.700.v20180518-1200\eclipse_1705.dll
-startup C:\Users\Fish\Downloads\#489\IBM Explorer for zOS\\plugins/org.eclipse.equinox.launcher_1.5.0.v20180512-1130.jar
--launcher.overrideVmargs
-showlocation
-pluginCustomization plugin_customization.ini
-vm C:\Users\Fish\Downloads\#489\IBM Explorer for zOS\jre\bin\j9vm\jvm.dll
-vmargs
-Dorg.eclipse.swt.accessibility.UseIA2=false
-Djava.util.Arrays.useLegacyMergeSort=true
-XX:MaxPermSize=256m
-Djava.class.path=C:\Users\Fish\Downloads\#489\IBM Explorer for zOS\\plugins/org.eclipse.equinox.launcher_1.5.0.v20180512-1130.jar
---------------------------
OK
---------------------------
According to "https://stackoverflow.com/questions/11461607/cant-start-eclipse-java-was-started-but-returned-exit-code-13" it's because I don't have a 64-bit version of JDK installed.
I've had MANY/MUCH PROBLEMS with Java in the past on my system, so I am NOT going to try installing java to try to fix the problem. Sorry! :(
Having a stable host system for Hercules development is of paramount importance to me and I don't trust Oracle at all. Every damn time I try to do something with java it invariably ALWAYS causes me MUCH GRIEF, causing me to spend a LOT of time and effort straightening out (fixing/undoing) the damage that Java/Oracle has done to my system!
So it looks like I'm not going to be able to reproduce your environment. :(
We'll just have to try and figure out what's wrong with your Windows 10 networking.
Additional info:
C:\Program Files (x86)\MegaRAID Storage Manager\JRE\bin> java -version
java version "1.8.0-ea"
Java(TM) SE Runtime Environment (build 1.8.0-ea-b88)
Java HotSpot(TM) Client VM (build 25.0-b30, mixed mode)
I'm not sure what "mixed mode" means, but the fact that it's in the "Program Files (x86)" directory tells me it's a 32-bit version of java.
I suppose I could download the 32-bit version of IBM Explorer for z/OS and try that. That might work. Let me think about that...
> I suppose I could download the 32-bit version of IBM Explorer for z/OS and try that. That might work. Let me think about that...
The 32-bit version doesn't work for me either. It gets the same error:
Java was started but returned exit code=13
So we're just going to have to debug this issue on your system instead. I'm unable to reproduce it on mine. :(
Yes, I think it's better because there is a lot of stuff to consider. Anyway, I was running a lot of tests trying to use a fixed IP but at some point nothing worked. So I uninstalled everything, verified that almost all entries in the registry were gone, reinstalled, and am now trying to make it work with DHCP. The range of IPs of my router goes from 192.168.1.30 to 192.168.1.63, so IPs around 100 are not used. I defined two devices in z/OS: one CTCI and one OSA, and both were started okay. I ran all tests with the CTCI one using IP 192.168.1.112.
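The DHCP-range check can be sketched in a few lines of Python with the standard `ipaddress` module. The pool boundaries and guest addresses are the ones given in this thread; the helper name `in_dhcp_pool` is hypothetical.

```python
import ipaddress

def in_dhcp_pool(addr: str, first: str, last: str) -> bool:
    """Return True if `addr` lies inside the inclusive range first..last."""
    candidate = ipaddress.IPv4Address(addr)
    return ipaddress.IPv4Address(first) <= candidate <= ipaddress.IPv4Address(last)

# Router pool and z/OS guest addresses as stated in this thread.
pool = ("192.168.1.30", "192.168.1.63")
for guest in ("192.168.1.112", "192.168.1.115"):
    print(guest, in_dhcp_pool(guest, *pool))  # both print False
```

Both guest addresses fall outside the pool, so the router should not hand either of them to another device.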
Once z/OS started, I did this.
HomeTEST:
EZA0619I Running IBM MVS TCP/IP CS V2R4 TCP/IP Configuration Tester
EZA0621I The FTP configuration parameter file used will be "TCPIP.FTP.DATA".
EZA0602I TCP Host Name is: S0W1
EZA0605I Using Host Tables to Resolve S0W1
EZA0611I The following IP addresses correspond to TCP Host Name: S0W1
EZA0612I 192.168.1.112
EZA0614I The following IP addresses are the HOME IP addresses defined in PROFILE.TCPIP:
EZA0615I 10.1.10.1
EZA0615I 192.168.1.112
EZA0615I 192.168.1.115
EZA0615I 10.1.10.1
EZA0615I 127.0.0.1
EZA0618I All IP addresses for S0W1 are in the HOME list!
EZA0622I Hometest was successful - all Tests Passed!
netstat HOME:
EZZ2350I MVS TCP/IP NETSTAT CS V2R4 TCPIP Name: TCPIP 16:53:22
EZZ2700I Home address list:
EZZ2701I Address Link Flg
EZZ2702I ------- ---- ---
EZZ2703I 192.168.1.112 ETH1 P
EZZ2703I 192.168.1.115 OSA1
EZZ2703I 10.1.10.1 EZASAMEMVS
EZZ2703I 127.0.0.1 LOOPBACK
EZZ2704I Address Interface Flg
EZZ2704I ------- --------- ---
EZZ2703I 10.1.10.1 EZAZCX
PINGS:
Pinging 192.168.1.112 with 32 bytes of data:
Reply from 192.168.1.112: bytes=32 time=2ms TTL=64
Reply from 192.168.1.112: bytes=32 time=1ms TTL=64
Reply from 192.168.1.112: bytes=32 time=2ms TTL=64
Reply from 192.168.1.112: bytes=32 time=1ms TTL=64
Ping statistics for 192.168.1.112:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 1ms, Maximum = 2ms, Average = 1ms
C:\Users\Carlos>ping 192.168.1.115
Pinging 192.168.1.115 with 32 bytes of data:
Reply from 192.168.1.115: bytes=32 time<1ms TTL=64
Reply from 192.168.1.115: bytes=32 time=3ms TTL=64
Reply from 192.168.1.115: bytes=32 time=1ms TTL=64
Reply from 192.168.1.115: bytes=32 time=2ms TTL=64
Ping statistics for 192.168.1.115:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 3ms, Average = 1ms
These PINGS were executed using TSO option 6:
CS V2R4: Pinging host linux.org (172.67.148.63)
Ping #1 response took 0.006 seconds.
CS V2R4: Pinging host github.com (140.82.112.4)
Ping #1 response took 0.170 seconds.
4. I was able to start a TSO session using CTCI connection.
5. TXOFF executed and RSED started.
6. At this point I attempted a connection from z/OS Explorer. The connection is established as you can see, but at this point the adapter failed. This failure can be consistently reproduced.
Options: CONN TCP TCPIP STACK TITLES ( CLI RSED*
EZZ2350I MVS TCP/IP NETSTAT CS V2R4 TCPIP Name: TCPIP 17:01:39
EZZ2585I User Id Conn Local Socket Foreign Socket State
EZZ2586I ------- ---- ------------ -------------- -----
EZZ2587I RSED 00000042 0.0.0.0..4035 0.0.0.0..0 Liste
EZZ2587I RSED1 0000006D 192.168.1.112..9308 192.168.1.36..50679 Estab
and this error is issued:
11.01.15 STC09959 +FEK115E write() failed. reason=(EDC5140I Broken pipe.)
> _EDC5140I Broken pipe._
> _**Explanation:** A write was attempted on a pipe or FIFO for which there was no process to read the data. This message is equivalent to the POSIX.1 EPIPE errno._
> _**System action:** The request fails. The application continues to run._
> _**Programmer response:** Refer to z/OS XL C/C++ Runtime Library Reference for the function being attempted for the specific reason for failure._
7. Hercules and z/OS stopped and started again. I'll try with z/OS Connect.
Using zCEE I could acquire a connection as well. I could do some tasks with services and APIs:
Options: CONN TCP TCPIP STACK TITLES ( CLI RSED ZC
EZZ2350I MVS TCP/IP NETSTAT CS V2R4 TCPIP Name: TCPIP 18:09:29
EZZ2585I User Id Conn Local Socket Foreign Socket State
EZZ2586I ------- ---- ------------ -------------- -----
EZZ2587I ZCEESRV1 00000066 0.0.0.0..9001 0.0.0.0..0 Liste
EZZ2587I ZCEESRV1 00000065 0.0.0.0..9002 0.0.0.0..0 Liste
EZZ2587I ZCEESRV1 00000064 0.0.0.0..9000 0.0.0.0..0 Liste
EZZ2587I ZCEESRV1 00000067 0.0.0.0..9003 0.0.0.0..0 Liste
EZZ2587I ZCEESRV1 000000AC 192.168.1.112..9002 192.168.1.36..53681 Estab
EZZ2587I ZCEESRV1 00000063 0.0.0.0..9004 0.0.0.0..0 Liste
EZZ2587I ZCEESRV1 00000047 127.0.0.1..1025 0.0.0.0..0 Liste
8. When I tried to deploy a service using that connection, the adapter crashed again.
When Hercules is stopped, it hangs:
18:17:52.348 0000180C HHC00417I 0:0AA8 CKD file d:/ZOS240/dasd/S1C521: cache hits 460, misses 152, waits 0
18:18:11.513 * "Hercules" forcibly terminated by user request
18:18:11.513 Hercules: kill 0000180C
18:18:11.513 kill 0x0000180C (Hercules)
_**Summary:**_
It's a strange failure, because the connections are established but something makes them fail. I had to reinstall Windows because I upgraded the motherboard, CPU and memory. Hercules ran very slowly on my old hardware; now it doesn't do too badly. I also installed another Windows, from another ISO file.
> and this error is issued:
> 11.01.15 STC09959 +FEK115E write() failed. reason=(EDC5140I Broken pipe.)
I had this same error before updating my Windows Firewall. The clue is:
> EDC5140I Broken pipe. Explanation: A write was attempted on a pipe or FIFO for which there was no process to read the data. This message is equivalent to the POSIX.1 EPIPE errno.
Try temporarily disabling your Windows Firewall, or adding a rule like I did further above. (And don't forget to disable the rule afterwards when you're no longer using it or no longer need it, and/or re-enable the Windows Firewall again if you chose to completely disable it, so that your network security isn't left in an exposed state! This is just a temporary test after all! Normally you shouldn't be disabling the Firewall, but we need to temporarily do so to determine whether or not it's the cause.)
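For readers unfamiliar with EPIPE, the condition in the EDC5140I explanation can be reproduced on any POSIX system with a few lines of Python. This is only a sketch of the errno semantics using a local pipe. It is not what RSED does internally, and it says nothing by itself about why the reader went away.

```python
import errno
import os

def write_to_closed_pipe() -> int:
    """Write to a pipe whose read end is already closed; return the errno seen."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)                 # no process is left to read the data
    try:
        os.write(write_fd, b"hello")
    except OSError as exc:            # raised as BrokenPipeError on POSIX
        return exc.errno
    finally:
        os.close(write_fd)
    return 0

# On a POSIX system the returned value is expected to equal errno.EPIPE.
print(write_to_closed_pipe() == errno.EPIPE)
```

The same errno is reported for a write on a socket whose peer is no longer there, which is why a firewall silently dropping the client's side of the conversation is one plausible suspect.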
P.S. Also, if you haven't done so already, you should add a "Ping" rule to your Windows Firewall as well, as explained in the "Add a "Ping" rule to Windows Firewall" topic of the "Common Problems" chapter of the CTCI-WIN Help file.
You might also need to do a network trace (e.g. Wireshark) to find out what's actually going on. That should let us know whether the packets are actually getting sent or not. If they are but the recipient isn't receiving them, it's more than likely the firewall.
You might need to add a custom (specific) Windows Firewall rule to let through all packets from z/OS (i.e. from IP addresses 192.168.1.112 and 192.168.1.115).
Hi I did some additional testing with both the firewall completely disabled and with new rules, which I'm assuming, I defined properly. It didn't work. So I installed Wireshark and ran a test capturing 192.168.1.112.
First, I logged into an IP terminal, which was okay, and saw the trace. Then I logged off. That part is not in the attached file.
Then I started a session from z/OS Explorer using Cicsplex manager, and it worked perfectly!
Then I did the usual stuff with RSED, and I got the error.
In the trace you'll see a TLS error:
10255 896.578269 192.168.1.36 192.168.1.112 TLSv1.2 61 Alert (Level: Fatal, Description: Unexpected Message)
then several:
10272 897.015414 192.168.1.112 192.168.1.36 TCP 54 [TCP Dup ACK 10269#1] 4035 → 54492 [ACK] Seq=1 Ack=28 Win=131040 Len=0
and finally, many:
10386 904.250068 192.168.1.112 192.168.1.36 TCP 1514 [TCP Spurious Retransmission] 9308 → 54493 [PSH, ACK] Seq=17 Ack=653 Win=130400 Len=1460.
Port 54493 was the one I was connected with:
EZZ2350I MVS TCP/IP NETSTAT CS V2R4 TCPIP Name: TCPIP 13:40:37
EZZ2585I User Id Conn Local Socket Foreign Socket State
EZZ2586I ------- ---- ------------ -------------- -----
EZZ2587I RSED 00000044 0.0.0.0..4035 0.0.0.0..0 Liste
EZZ2587I RSED9 0000005B 192.168.1.112..9308 192.168.1.36..54493 Estab
So I have to see why TLS is involved here, because I thought I disabled it.
I hope this helps.
Please let me know if you want some special print of the trace.
Also... when I installed Wireshark, I was notified about this:
Should I install Npcap?
Hi
I officially installed Wireshark and rebooted the PC. (I was using the portable version before.)
I started the .112 trace before starting Hercules. At the beginning of the trace, I see the same IP .112 has two MACs. Don't know how that can be possible.
After that the same error occurred:
7 153.340184 02:00:5e:a8:01:73 Broadcast ARP 42 ARP Announcement for 192.168.1.112 (duplicate use of 192.168.1.112 detected!)
8 153.340261 02:00:5e:a8:01:70 02:00:5e:a8:01:73 ARP 42 Gratuitous ARP for 192.168.1.112 (Reply)
9 158.600785 02:00:5e:a8:01:73 02:00:5e:a8:01:73 ARP 42 Gratuitous ARP for 192.168.1.112 (Reply) (duplicate use of 192.168.1.112 detected!)
10 189.193929 AskeyCom_79:c8:10 Broadcast ARP 60 Who has 192.168.1.112? Tell 192.168.1.1
11 189.194015 02:00:5e:a8:01:70 AskeyCom_79:c8:10 ARP 42 192.168.1.112 is at 02:00:5e:a8:01:70
12 189.194112 02:00:5e:a8:01:73 AskeyCom_79:c8:10 ARP 42 192.168.1.112 is at 02:00:5e:a8:01:73
New trace:
Bye.
> and ran a test capturing 192.168.1.112.
You should have captured both 192.168.1.112 and 192.168.1.115. Earlier you said:
> I defined two devices in z/OS: one CTCI and one OSA.
And we can see the following in your HOMETEST:
> EZA0614I The following IP addresses are the HOME IP addresses defined in PROFILE.TCPIP:
> EZA0615I 10.1.10.1
> EZA0615I 192.168.1.112
> EZA0615I 192.168.1.115
So it looks like you've defined both IP addresses to your z/OS guest. I'm presuming one of them was assigned to the CTCI device and one was assigned to the OSA device.
> Then I started a session from z/OS Explorer using Cicsplex manager, and it worked perfectly!
Fantastic!
> Then I did the usual stuff with RSED...
Wait... WHAT?!
You "did the usual stuff"? What does that mean? Do you mean you started (ran) TXOFF and then started RSED afterwards? Is that what you mean? Did you cancel/kill the existing RSED beforehand? Because if you didn't, then you would end up with TWO running RSED instances, which might explain the "TCP Dup ACK" and "TCP Spurious Retransmission" errors you're seeing in your Wireshark trace.
Did you disconnect from your previous session before you "did the usual stuff with RSED"? That might explain things as well.
As far as I know, you can (should) only have ONE and only one running instance of RSED (unless each instance is listening for connections on a completely different port of course). Having multiple server instances each listening for connections on the same server port is a recipe for disaster.
> Should I install Npcap?
No.
Well... technically... you can if you want to. But if you do, any problems you might have with CTCI-WIN and/or Hercules networking in general are your own to resolve. Using Npcap instead of WinPCap is unsupported by CTCI-WIN. It might work, or it might not. I don't know. I've never tried it and am not interested in trying it. WinPCap works fine.
Conclusion:
Based on the fact that you were able to successfully establish a z/OS Explorer session, it sounds to me like your Windows Firewall was the culprit all along. Which makes sense. Hercules (specifically, z/OS's RSED) was trying to communicate with your z/OS Explorer client, and Windows Firewall wasn't letting anything through. Thus the broken pipe write failures. As soon as you disabled(?) (i.e. "fixed") your Windows Firewall issue, things started working.
NOW your only problem is having two different IP addresses assigned to your z/OS guest because you have two different adapters/interfaces defined: one CTCI and one OSA.
My suggestion would be to choose one or the other and drop the other. Personally I prefer OSA myself. While I'm sure CTCI or even LCS too would both also work just fine, OSA is more modern from a z/OS point of view and thus the device/protocol that z/OS more than likely "prefers", so that's what I'd go with: OSA.
But the choice is yours.
p.s. I personally don't recall having to mess with TLS or SSL at all, so I'm not sure what you're referring to having to do? I didn't have to change/configure anything. I just started IBM Explorer for z/OS, connected, and VOILA! I was up and running.
At the beginning of the trace, I see the same IP .112 has two macs. Don't know how that can be possible.
It's probably because you have two networking interfaces defined in your z/OS guest (one, a CTCI device, the other, an OSA device), and (I'm presuming), you've more than likely defined both of them with the same IP address. You need to either get rid of one of them or else assign it a different IP address.
11 189.194015 02:00:5e:a8:01:70 AskeyCom_79:c8:10 ARP 42 192.168.1.112 is at 02:00:5e:a8:01:70
12 189.194112 02:00:5e:a8:01:73 AskeyCom_79:c8:10 ARP 42 192.168.1.112 is at 02:00:5e:a8:01:73
As you can see from your Wireshark trace, the two conflicting MAC addresses are "02:00:5e:a8:01:70" and "02:00:5e:a8:01:73". Those are MAC addresses that are automatically generated by Hercules, and correspond to IP addresses 192.168.1.112 and 192.168.1.115. (x'70' = 112 and x'73' = 115):
If not specified then one will be internally generated in the range 02:00:5E:80:00:00 - 02:00:5E:FF:FF:FF using the low order 23 bits of the IPv4 address. For example, if the ipv4 address is 10.1.2.3 the generated MAC address will be 02:00:5E:81:02:03.
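To make the rule quoted above concrete, here is a minimal Python sketch of the derivation. It is not Hercules's actual code; the function name generated_mac is made up for illustration, and setting the top bit of the last three octets is my assumption, inferred from the documented range (80:00:00 - FF:FF:FF) and the documented 10.1.2.3 example.

```python
import ipaddress

def generated_mac(ipv4: str) -> str:
    """Derive a 02:00:5E:xx:xx:xx MAC from an IPv4 address (illustrative only)."""
    addr = int(ipaddress.IPv4Address(ipv4))
    low23 = addr & 0x7FFFFF      # low-order 23 bits of the IPv4 address
    suffix = 0x800000 | low23    # assumption: top bit set, giving 80:00:00 - FF:FF:FF
    octets = [0x02, 0x00, 0x5E,
              (suffix >> 16) & 0xFF, (suffix >> 8) & 0xFF, suffix & 0xFF]
    return ":".join(f"{o:02X}" for o in octets)

for ip in ("10.1.2.3", "192.168.1.112", "192.168.1.115"):
    print(ip, "->", generated_mac(ip))
```

Running it for 192.168.1.112 and 192.168.1.115 gives MACs ending in A8:01:70 and A8:01:73, which are the two MACs answering for .112 in the trace.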
Methinks you need to empty your ARP cache (and/or delete one or both of the entries for .112 and .115), as well as fix your z/OS guest's networking device and IP address assignments.
Once you do that (along with your existing Windows Firewall fix which you've already done), things should start working just fine for you.
AS FAR AS THE ORIGINAL HERCULES CRASH IS CONCERNED...
I'm going to have to presume it's simply a side effect of your Windows Firewall that unfortunately just happened by coincidence to impact Hercules. What more than likely happened was whatever packets needed to be sent/received by Hercules as part of z/OS's attempt to halt its networking adapter, got "eaten" by Windows Firewall, causing Hercules to either end up waiting forever for a response to one of its requests, or, for it to wait "too long" (i.e. longer than 20 seconds) for the response.
When that happens, Hercules's "watchdog" thread (whose interval is currently hard coded at 20 seconds) kicks in and notices that one of Hercules's guest processors hasn't made any progress for the past 20 seconds (indicating something is very wrong somewhere, since no instruction should ever take longer than 20 seconds to complete!) and so forces a crash dump.
At least that's my working theory anyway.
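For anyone unfamiliar with the watchdog idea, the following Python sketch shows the general technique only. It is not how Hercules implements it; the class name, the instruction_count parameter and the injectable clock are all hypothetical, chosen just to show "no progress for longer than the timeout means trouble".

```python
import time

class Watchdog:
    """Flags a worker whose progress counter has not changed within `timeout` seconds."""

    def __init__(self, timeout=20.0, clock=time.monotonic):
        self.timeout = timeout        # 20 seconds, mirroring the interval mentioned above
        self.clock = clock            # injectable so the logic can be tested without waiting
        self.last_count = None
        self.last_change = clock()

    def check(self, instruction_count):
        """Return True if the counter has been stuck for longer than the timeout."""
        now = self.clock()
        if instruction_count != self.last_count:
            # Progress was made: remember the new count and when we saw it.
            self.last_count = instruction_count
            self.last_change = now
            return False
        return (now - self.last_change) > self.timeout
```

A monitoring thread would call check() periodically with the processor's current instruction count and react (in Hercules's case, by forcing a dump) when it returns True.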
Do that, and you should be fine.
Hope that helps!
Very simple test:
This test can be consistently reproduced.
Documents attached:
Have a nice weekend.
192.168.1.36 is sending Ethernet frames that are too large, with the IP packet containing a length of zero, see frames 7997 and 7999 in the last trace you provided. Check the network settings on 192.168.1.36.
OK, what should I test? Could you be more specific, please? Are you referring to these options?
could you be more specific please?
No, I'm afraid I can't, I don't know anything about your machine(s), or your network. All I know is what the Wireshark trace on the Hercules host showed, i.e. that 192.168.1.36 is sending Ethernet frames that appear to be unusual.
All I can suggest is that you disable any offloads, and check the MTU that is in use.
Ian said:
192.168.1.36 is sending Ethernet frames that are too large, with the IP packet containing a length of zero, see frames 7997 and 7999 in the last trace you provided. Check the network settings on 192.168.1.36.
Thank you for that, Ian! I have not had a chance to download or examine Carlos's latest postings yet. (I just woke up!) I will do so A.S.A.P., but it sounds like you may have already found the problem.
Carlos said:
Are you referring to these options?
Yes. Due to the way CTCI-WIN works, since your IBM Explorer for z/OS client is running on the same system that Hercules is running on, packets to/from your IBM Explorer client and your z/OS Hercules guest are being intercepted before they reach the actual physical Windows adapter (which is where the offloading actually occurs), resulting in Hercules receiving packets larger than it can handle. (Your Windows host thinks it is communicating with another physical system somewhere out there on your local internet and so is purposely sending "Large" packets to be efficient since it believes your adapter will properly "offload" them to smaller packets.)
But because WinPCap (CTCI-WIN) intercepts them before they reach the physical adapter, the offloading is not happening and Hercules ends up receiving packets much larger than it can handle. This results in malformed packets being received by your z/OS guest (which is why it keeps trying to disable its OSA adapter as part of its error recovery).
Make sure your "Large Send Offload" and "Jumbo Frame" settings are set to "Disable". This is mentioned in the "Disable an adapter's Large Send Offload (LSO) option" section of the "Common Problems" chapter of the CTCI-WIN Help file.
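If you want to check a capture for the two symptoms Ian described (oversized frames, and an IP length field of zero), a rough Python sketch follows. It assumes plain Ethernet II frames without VLAN tags and a standard MTU of 1500; the function name inspect_frame is made up, and how you get the raw frame bytes out of your capture is up to you.

```python
import struct

ETH_HEADER_LEN = 14        # destination MAC + source MAC + EtherType
ETHERTYPE_IPV4 = 0x0800

def inspect_frame(frame: bytes, mtu: int = 1500) -> list:
    """Return a list of problems found in one raw Ethernet frame (illustrative only)."""
    if len(frame) < ETH_HEADER_LEN + 20:
        return ["frame too short to hold an IPv4 header"]
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_IPV4:
        return []          # not IPv4, nothing to check here
    problems = []
    if len(frame) > ETH_HEADER_LEN + mtu:
        problems.append(f"frame is {len(frame)} bytes, larger than MTU {mtu} allows")
    # The IPv4 total length field sits at bytes 2-3 of the IP header.
    (total_length,) = struct.unpack("!H", frame[16:18])
    if total_length == 0:
        problems.append("IPv4 total length field is zero")
    return problems
```

A frame that triggers both checks looks like the ones in frames 7997 and 7999 of the trace. Once the offload settings are disabled, neither check should fire any more.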
I will download and examine your latest tests as soon as I've had my first cup of coffee.
Can you post your current Hercules configuration file, please? Thanks.
Thank you.
I notice you have switched to using Windows adapter 192.168.1.36, whereas before you were using 192.168.1.37. Can you post another TTTest64 report, please? Thanks.
My IP changes whenever I start my PC... now it is .37 again... that's why I wanted to use the MAC.
Interesting! Usually when you use DHCP, the lease is simply renewed on the IP Address that was already previously assigned, and thus should be stable. I've never heard of a DHCP server assigning a brand new IP address before. I wonder why that's happening? Who's your DHCP server? Your router/gateway? 192.168.1.1? What manufacturer/model is it? (Not important. Just curious.)
FYI: I noticed you've added NETDEV D8-5E-D3-81-FE-1D to your configuration file. Because you have, you now shouldn't need to specify any iface parameter on your OSA device statement. You should now be able to just use:
0400.3 OSA chpid F0 ipaddr 192.168.1.112 netmask 255.255.255.0
(When iface is not specified, it defaults to your NETDEV value)
My IP changes whenever I start my PC
Why not use a static IP address?
NOTE: GitHub Issue #458 (Hercules crash after resume from suspend) is also closely related to this issue.
Hi guys,
I had to reinstall my Windows 10 after a problem. After reinstalling Hercules, I'm now having an odd problem that did not happen before, regarding my external connection to use other Windows applications on the same PC. I reinstalled all Hercules software.
I also ran TTTest64.exe successfully. I have the default DHCP configuration on my PC. I always used a CTCI connection, but it now fails with no error message. So I tried with OSA, and now I get a dump.
z/OS 2.4 starts OK and the OSA is installed and activated OK.
I can start TSO sessions from any PC in my network, and I can use CICSPLEX very well using cicsexpl. When I attempt to use an azosexpl session with RSED, the connection crashes, and Hercules does as well. Dump available:
I'd appreciate any help!
Regards, Carlos
OSA Adapter
Hercules log