Closed lolydaggle closed 6 years ago
@lolydaggle I love example videos! Thanks for making it 👍 If I interpret this video correctly, it seems to work the first time, but not when trying again, correct?
It was a little hard to tell because the first 20 seconds or so were non-HD for me, but it appeared the first (successful) connection was to a bot host (Bot73_NJ_USA), while the second (failed) connection was to a user host (fernaffen). The bot had 0 players connected, while fernaffen had 2 players connected (the maximum for Blue vs. Gray).
I wonder if it has anything to do with the maximum number of players reached? Let me try to repro that scenario locally.
Let me try to repro that scenario locally.
So that worked fine for me, both with and without a lobby. 😞
@RoiEXLab Yes, I can connect to bots, but not to player-hosted games.
@ssoloff I tried it again with a different map and it still showed the same.
@lolydaggle Is this the case for all player-hosted games? If not, it could be that those users are running a different version, which could cause this problem.
@RoiEXLab Yes, I can connect to all bots. It doesn't seem to be an issue with versions, as when I tried it with the person I usually play with, we both had the same version. It was working fine for a while, but suddenly stopped working after a couple of weeks.
I was looking around the client connection code to see if there is any additional logging we can enable in 1.9.0.0.3635, but I didn't see anything substantive. However, it would be useful if we could get a stack trace to see where exactly the code is hung. We could do that pretty simply using the jstack tool, but that requires having a JDK installed as opposed to simply a JRE.
@lolydaggle Do you happen to have the Java 8 Developer Kit installed? If not, would you be willing to install it? You can tell if you have a JDK installed by looking at your Java installation folder (typically C:\Program Files\Java) and seeing if there is a subfolder that begins with jdk1.8.0.
Once a JDK is available, you can use jps to get the process ID of TripleA:
C:\>jps
1204 Jps
3788 TripleA.exe
Then use jstack to get a dump of all threads (that's a lower-case "L" in the switch after the jstack command):
C:\>jstack -l 3788
2017-08-09 17:26:02
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.121-b13 mixed mode):
<. . . snip . . .>
@ssoloff So I installed the dev kit, but how do I do the thing with the jps? Specifically, what do I do from here? Or do I go from there at all?
@lolydaggle Great! I outlined the steps you need to follow below.
First, you'll need to open a command prompt. See here for various ways to do that if you're not sure how. The command prompt should open to your home directory (e.g. C:\Users\Me).
Next, you'll need to set the path Windows uses to find executable files to include the JDK. Type the following in the command prompt and press ENTER:
PATH C:\Program Files\Java\jdk1.8.0_144\bin;%PATH%
(You can also copy-paste the above command directly into the command prompt window. Note, however, that you can't use CTRL+V to paste in a command prompt. Instead, right click your mouse in the command prompt window and select Paste on the context menu that pops up.)
Start TripleA and get it into the state where it's hung.
Back in your command window, run the jps command to get the list of all Java processes on your system:
C:\Users\Me>jps
4864 TripleA.exe
4944 GameRunner
976 Jps
The output from your command will most assuredly be different. However, you should have both a TripleA.exe process and a GameRunner process. The TripleA.exe process should correspond to your lobby window, while the GameRunner process should be your hung game.
You'll want to write down the number in front of the GameRunner process because this output is going to quickly scroll off the screen when we move on to the next step. If you happen to have more than one GameRunner, don't worry, just write them all down; we'll run the command below for each of these processes.
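If you would rather not pick the numbers out by eye, the same lookup can be sketched in a few lines of Java. This is only an illustration: the class and method names (JpsOutputParser, pidsFor) are made up for this example, and it assumes you have already captured the text that jps printed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of TripleA or the JDK): extracts process IDs
// from text in the "<pid> <name>" format that jps prints.
final class JpsOutputParser {

    /** Returns the IDs of all processes whose jps name equals {@code name}. */
    static List<Long> pidsFor(String jpsOutput, String name) {
        List<Long> pids = new ArrayList<>();
        for (String line : jpsOutput.split("\\R")) {
            // Split each line into at most two parts: the PID and the process name.
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length == 2 && parts[1].equals(name)) {
                try {
                    pids.add(Long.parseLong(parts[0]));
                } catch (NumberFormatException e) {
                    // Not a "<pid> <name>" line; ignore it.
                }
            }
        }
        return pids;
    }

    public static void main(String[] args) {
        // Sample text copied from the jps output shown above.
        String sample = "4864 TripleA.exe\n4944 GameRunner\n976 Jps\n";
        System.out.println(pidsFor(sample, "GameRunner"));
    }
}
```

If several GameRunner processes are running, the returned list simply contains one entry per process.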
Finally, you want to run the jstack command to get the stack traces of the hung GameRunner process:
C:\Users\Me>jstack -l 4944
2017-08-09 17:26:02
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.121-b13 mixed mode):
<. . . snip . . .>
As I noted in my previous comment, the letter in -l is a lower-case "L". Also, you'll need to replace 4944 with whatever number was printed before the GameRunner process when you ran the jps command above (if there were multiple GameRunner processes, just pick the first one for now).
You should have seen a few screenfuls of information fly by when you did that. If so, good. If not, verify that you typed the command correctly and used the correct process ID.
Now, you're going to re-run the command and save all that output to a file. So, type the same command you just ran (or press the up arrow to bring back the previous command you typed), but before you press ENTER, add the following: > GameRunner-4944.txt. That is, the full command will be:
jstack -l 4944 > GameRunner-4944.txt
Again, replace 4944 with the GameRunner process ID from your system.
If you had multiple GameRunner processes when you ran jps, go ahead and repeat the previous command for each. For example, if you had a second GameRunner process with ID 1234, type:
jstack -l 1234 > GameRunner-1234.txt
You should now have one or more .txt files in your home directory (or whatever directory your command prompt was running in) containing stack traces. Go ahead and attach each of them to this issue. Once attached, feel free to delete them from your machine.
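For anyone who ends up repeating this for many processes, the redirect step can also be scripted. The following is a rough Java sketch, not something TripleA ships: the class name ThreadDumpSaver and its methods are invented for illustration, and it assumes jstack is on the PATH exactly as set up earlier.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: runs "jstack -l <pid>" and saves the output to
// GameRunner-<pid>.txt, mirroring the manual redirect described above.
final class ThreadDumpSaver {

    /** Builds the same command line that was typed by hand. */
    static List<String> jstackCommand(long pid) {
        return Arrays.asList("jstack", "-l", Long.toString(pid));
    }

    /** Uses the file naming convention from the instructions above. */
    static String outputFileName(long pid) {
        return "GameRunner-" + pid + ".txt";
    }

    /** Runs jstack for {@code pid} and returns its exit code. Assumes jstack is on the PATH. */
    static int saveDump(long pid, File directory) throws IOException, InterruptedException {
        File out = new File(directory, outputFileName(pid));
        Process process = new ProcessBuilder(jstackCommand(pid))
                .redirectErrorStream(true) // keep error messages in the same file
                .redirectOutput(out)       // equivalent of "> GameRunner-<pid>.txt"
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Pass one or more process IDs (from jps) as command-line arguments.
        for (String arg : args) {
            long pid = Long.parseLong(arg);
            int exitCode = saveDump(pid, new File("."));
            System.out.println(outputFileName(pid) + " written, jstack exit code " + exitCode);
        }
    }
}
```

Typing the two commands by hand is still the simpler route for a one-off report.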
@ssoloff Alright, I did it and attached the text file. Hopefully I did it right.
@lolydaggle
Hopefully I did it right.
Like a champ. :+1: I see the thread that's hanging. I'll try to dig into it later tonight or tomorrow.
@lolydaggle I took a closer look at the stack traces you provided and believe we're closer to understanding what's going on.
Here's the stack trace for the thread that's hung:
"AWT-EventQueue-0" #15 prio=6 os_prio=0 tid=0x0000000017f22800 nid=0x21e4 waiting on condition [0x0000000018bcd000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000eff9c5f0> (a java.util.concurrent.CountDownLatch$Sync)
    at java.util.concurrent.locks.LockSupport.park(Unknown Source)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(Unknown Source)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(Unknown Source)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(Unknown Source)
    at java.util.concurrent.CountDownLatch.await(Unknown Source)
    at games.strategy.util.EventThreadJOptionPane.awaitLatch(EventThreadJOptionPane.java:62)
    at games.strategy.util.EventThreadJOptionPane.showMessageDialog(EventThreadJOptionPane.java:39)
    at games.strategy.util.EventThreadJOptionPane.showMessageDialog(EventThreadJOptionPane.java:24)
    at games.strategy.engine.framework.startup.mc.ClientModel.createClientMessenger(ClientModel.java:163)
    at games.strategy.engine.framework.startup.mc.SetupPanelModel.showClient(SetupPanelModel.java:61)
    at games.strategy.engine.framework.startup.ui.MainFrame$2.run(MainFrame.java:86)
<. . . snip . . .>
First, the root cause of your problem is definitely a network connectivity issue. From the above stack trace, execution passed through ClientModel.java:163, which is an error handler for the case "Unable to connect". Normally, a dialog would be displayed providing you with information on what the underlying error is. However, due to a bug in 1.9.0.0.3635, this dialog is never displayed and the UI freezes, as you observed. That bug was reported as #1493 and was fixed a few months ago in #1558. The details are unimportant other than to say that when we attempt to display the error dialog, we deadlock ourselves and code execution on the UI thread stops.
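For readers curious how a thread can deadlock itself while waiting on a CountDownLatch, here is a minimal sketch of the general pattern. It is not TripleA's code: a single-thread executor stands in for the UI thread, the class and method names are invented, and a timed await is used so the demo terminates instead of hanging the way the untimed await in the stack trace did.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative only: a single-thread executor plays the role of the UI thread.
final class SingleThreadLatchDemo {

    /** Queues a task on {@code ui} and waits for it; returns false if the wait times out. */
    static boolean showAndWait(ExecutorService ui, long timeoutMillis) throws InterruptedException {
        CountDownLatch shown = new CountDownLatch(1);
        ui.execute(shown::countDown); // stand-in for "display the dialog on the UI thread"
        return shown.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Calls showAndWait from a task that is itself running on the "UI" thread. */
    static boolean calledFromUiThread(long timeoutMillis) throws Exception {
        ExecutorService ui = Executors.newSingleThreadExecutor();
        try {
            // The only UI thread is busy waiting, so the queued task can never start.
            Callable<Boolean> task = () -> showAndWait(ui, timeoutMillis);
            return ui.submit(task).get();
        } finally {
            ui.shutdownNow();
        }
    }

    /** Calls showAndWait from a different thread, leaving the "UI" thread free to run the task. */
    static boolean calledFromOtherThread(long timeoutMillis) throws Exception {
        ExecutorService ui = Executors.newSingleThreadExecutor();
        try {
            return showAndWait(ui, timeoutMillis);
        } finally {
            ui.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("from UI thread:    " + calledFromUiThread(200));
        System.out.println("from other thread: " + calledFromOtherThread(5000));
    }
}
```

The wait started from the "UI" thread should time out because the work it is waiting for sits behind it in the same queue, while the wait started from another thread completes. With an untimed await, the first case would block forever, which matches the frozen window.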
What we really need to do is get more information about why you can't connect to these hosts. Hopefully, the error message that should be displayed would actually help diagnose that problem. Since you can't see this information in 1.9.0.0.3635 due to the above bug, I would recommend that you install the latest pre-release version and reproduce the problem there. Then you can report the network error message, and we can try to help you figure out why you can't connect to these hosts.
You should be able to install a second instance of TripleA as long as you choose a different installation folder. If you don't feel comfortable doing that, you can create a new user on your machine (e.g. for testing purposes), and install the latest TripleA as that user. That way any changes the new version may make to your settings will be isolated from your original 3635 installation.
@ssoloff So I installed the latest pre release, 6217. When I tried to connect, this error code popped up, saying the connection is refused.
@lolydaggle I'm by no means a networking expert, but my guess is that there is a firewall somewhere between you and the game host blocking the connection.
There are some things you can do to try to track down the firewall that's blocking you. Once you know the offending firewall, you may or may not be able to do something about it. An example of "may" would be your own firewall blocking outbound traffic to a range of IPs (typically seen in parental controls). An example of "may not" would be that you live in a country where the government is selectively blocking traffic. :smile:
In order to do anything, you'll need to get the IP address and port that the game host is using. You can do this by following a similar procedure to what you did above with jstack. Starting from the point where you used jps to get the ID of the GameRunner process, run the following from your command prompt:
jinfo -sysprops 7777 | sort | findstr /b triplea.
where you replace 7777 with the process ID you got from jps.
This should display a few lines of output. You're looking for the lines triplea.host and triplea.port. That's the public IP address and TCP port of the game host you're trying to connect to. (Note that this information is transient and is no longer valid once the host disconnects. Keep this in mind when you try the techniques below because, if the host disconnects, any testing you do will no longer be valid, and you will just be spinning your wheels. I would keep the lobby running so you can see when this particular host disconnects.)
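To make the sort and findstr part of that pipeline concrete, here is a small Java sketch of the same filtering applied to a Properties object. It only illustrates the filter; it does not read properties from another process the way jinfo does. The class name TripleaProps is made up, and the sample values are the placeholders used in this comment.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper: keeps only the properties whose key starts with a prefix,
// sorted by key, roughly what "sort | findstr /b triplea." does to jinfo's output.
final class TripleaProps {

    static Map<String, String> withPrefix(Properties props, String prefix) {
        Map<String, String> result = new TreeMap<>(); // TreeMap keeps keys sorted
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(prefix)) {
                result.put(key, props.getProperty(key));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder values; real ones come from the running GameRunner process.
        Properties sample = new Properties();
        sample.setProperty("triplea.port", "8888");
        sample.setProperty("triplea.host", "1.2.3.4");
        sample.setProperty("java.version", "1.8.0_144");
        System.out.println(withPrefix(sample, "triplea."));
    }
}
```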
First you want to check if you can ping the host, so type the following in your command prompt:
ping 1.2.3.4
where you replace 1.2.3.4 with the value of the triplea.host property you saw above. If you can ping, great. If you can't, it doesn't necessarily mean anything because the host could just have ICMP echo disabled. However, it would still be interesting to see if your packets are making it all the way to the host, so type this command next:
tracert 1.2.3.4
This command could take a few minutes to run. If you start seeing a bunch of lines with asterisks (*) in each column, you can just cancel the command with CTRL+C, as you've pretty much hit the end of the road. The last line with meaningful information is potentially the node that is blocking you.
To verify that the connectivity issue has nothing to do with TripleA, you can try to Telnet to the host manually. The Telnet client isn't installed in Windows by default, so you'll need to go to Control Panel > Programs and click Turn Windows features on or off. When the Windows Features dialog opens, scroll down to Telnet Client, check it, and press OK.
Go back to your command prompt and type:
telnet 1.2.3.4 8888
where you replace 8888 with the value of the triplea.port property you saw above. We're expecting that you get some kind of error. If, for some reason, you can connect (as evidenced by a black screen), press CTRL+] and type quit followed by ENTER to close the Telnet client. If you do not get a connection error when using Telnet, please report that here, as it means there may be a problem with TripleA.
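If enabling the Telnet client is a hassle, a plain TCP connection attempt answers the same question. Below is a minimal Java sketch under that assumption. The class name TcpProbe is invented for this example, and it only checks whether a TCP connection can be opened; it does not speak TripleA's protocol.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical stand-in for the manual Telnet check: tries to open a TCP
// connection and reports whether it succeeded.
final class TcpProbe {

    /** Returns true if a TCP connection to host:port opens within the timeout. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unreachable hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Usage: java TcpProbe <triplea.host value> <triplea.port value>
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        System.out.println(canConnect(host, port, 5000) ? "connected" : "could not connect");
    }
}
```

As with Telnet, a successful connection here while TripleA still fails would point toward a problem in TripleA rather than in the network.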
The final thing you can do is also, unfortunately, the most complicated. It involves capturing network traffic while you attempt to connect to the game host so you can see the packets that are sent back from the firewall that is refusing the connection. If the firewall has been configured to respond with icmp-port-unreachable, the source address in the packet should be that of the blocking firewall, and you can use that to determine if it's your firewall, the game host's firewall, or someone in between (e.g. your ISP).
On Windows, the easiest tool to use to do this is Wireshark. Unfortunately, trying to explain how to configure Wireshark and run it to do what I described above would probably quadruple the size of this already-long post. :smile: If you want to give it a shot, you can find many tutorials (including videos) online describing how to use Wireshark.
However, I can outline the basic steps you would need to follow:
Run the jinfo command I listed above and record the host and port values so you know what destination to look for in the capture.
If you can get a capture of that scenario, feel free to zip it up and post it here, and I'll take a look at it.
@ssoloff the 'host' should be the lobby, right?
@lolydaggle Wireshark can be a bit difficult. There is a Telnet client on Windows; can you let us know whether you can establish a Telnet connection to 45.79.144.53 port 3304? (https://www.youtube.com/watch?v=e_qrl04H8Bk) A second thing to try: if it's reasonably safe, disable any firewalls you know of and see if the problem goes away. That could help isolate the problem quickly, though take care to re-enable the firewalls afterward. Please let us know how it goes, and thanks for reporting.
@DanVanAtta
the 'host' should be the lobby, right?
I don't believe so. When joining a game from the lobby, a new TripleA process is launched and the triplea.host system property is set to the IP of the game's host, which is either a bot or the public IP of an individual user.
I just connected to both a bot-hosted and a user-hosted game from the lobby. For the bot-hosted game, the triplea.host and triplea.port properties pointed to the endpoint 45.33.80.67:40010. For the user-hosted game, these properties pointed to the endpoint 207.237.xxx.yyy:3300 (obfuscated to avoid identifying the user).
Ah, indeed, my mistake. I was too focused on "game through the play online option".
@prastle This issue reminds me of the issue we have with scousemart's bot.
However, I made the UI non-blocking recently, so the window shouldn't freeze any longer. Closing this.
GameRunner-9704.txt
My Operating System:
Windows 10 Pro
TripleA version:
1.9.0.0.3635
Map:
n/a
Can you describe how to trigger the error? (eg: what sequence of actions will recreate it?)
https://www.youtube.com/watch?v=1TFtvJcVI5A When trying to join a user hosted game through the play online option, it opens the main menu window and won't go anywhere from there. It is un-interactable, and can only be closed through the task manager.
Do you have the exact error text?
no, unsure if there is
Instead of this error, what should have happened?
Go to the select screen for teams and such.
Any additional information that may help:
It was suggested that it might be a network problem, and if it was then if I waited long enough an error console would pop up. I left it open for a while, around 20 minutes or so and no console popped up.