panel-attack / panel-game

Panel Attack is a free modern puzzle game inspired by popular games such as Tetris Attack and Pokemon Puzzle League while still maintaining authentic mechanics. Arrange colored panels in rows and columns of three or more to make matches that clear. Panels then fall under gravity and can form chains that give bonuses or attack the other player.

Use UDP for player inputs #30

Open sharpobject opened 5 years ago

sharpobject commented 5 years ago

Hello, let's stop using TCP for gameplay and stop proxying all the inputs between the two players.

This requires at least 2 bits of work. Read about the first one here: https://en.wikipedia.org/wiki/UDP_hole_punching. Read about the second one here: https://gafferongames.com/post/reliable_ordered_messages/. For the second post, we don't need to do exactly what's in the post, we just need some sort of ARQ/FEC. A different sort of ARQ/FEC than that in the post might be reasonable because our game can pack 32 frames of data in 24 bytes.
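To illustrate the "32 frames of data in 24 bytes" figure: 24 bytes are 192 bits, which works out to 6 bits per frame. The following is only a sketch of such a packing. The function names and the big-endian, first-frame-first layout are assumptions made for illustration, not the game's actual wire format.

```python
BITS_PER_FRAME = 6  # 24 bytes * 8 bits / 32 frames


def pack_inputs(frames):
    """Pack a list of 6-bit input masks into bytes (hypothetical layout)."""
    acc = 0
    for mask in frames:
        if not 0 <= mask < (1 << BITS_PER_FRAME):
            raise ValueError("input mask does not fit in 6 bits")
        acc = (acc << BITS_PER_FRAME) | mask
    total_bits = BITS_PER_FRAME * len(frames)
    n_bytes = (total_bits + 7) // 8
    # Left-align the data so any padding bits end up at the very end.
    acc <<= n_bytes * 8 - total_bits
    return acc.to_bytes(n_bytes, "big")


def unpack_inputs(data, count):
    """Recover `count` 6-bit input masks from bytes made by pack_inputs."""
    acc = int.from_bytes(data, "big")
    total_bits = len(data) * 8
    frame_mask = (1 << BITS_PER_FRAME) - 1
    frames = []
    for i in range(count):
        shift = total_bits - BITS_PER_FRAME * (i + 1)
        frames.append((acc >> shift) & frame_mask)
    return frames
```

With a payload this small, resending the last 32 frames in every packet costs very little, which is why a redundancy-based scheme may fit better than the retransmission scheme in the linked post.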

jon12156 commented 5 years ago

This would mean the server can't save replays anymore, unless they are sent by clients after the game is over.

and spectating can't happen?

Or would we send inputs via UDP to the opponent and the server?

sharpobject commented 5 years ago

We can send inputs to the server as well, for recording replays and spectating, probably with no change to the way it is currently done.

JamesVanBoxtel commented 2 years ago

Is the motivation for this lower lag between players?

Endaris commented 2 years ago

> Is the motivation for this lower lag between players?

It should be. Given that we still have games desyncing based on latency, we should keep this in mind to do at... some point. Some considerations to be made for this:

- Whether you want to use P2P or not should be configurable in the options; the current TCP solution remains available.
- Extend the client to inform the server about the IP/port for the UDP connection (encrypted).
- Extend the server protocol to receive and send encrypted P2P connection info from/to players.
- Extend the client to send inputs to both the server and the opponent in case P2P is being used.
- Extend the client/server to send/accept a match abort in case latency is too high.
- Extend the server to measure round-trip time for both clients so that the match start can be sent to the clients in appropriate intervals (if that makes sense?).

Some other links for resources we might want to use or reference:
https://www.love2d.org/wiki/Tutorial:Networking_with_UDP
https://www.love2d.org/wiki/lua-enet

Endaris commented 1 year ago

Someone made me read up a bit more on networking for multiplayer games, and I stumbled upon one relevant reason why a change to UDP-based communication is more of a needed feature: packet loss.

In TCP-based communication, a lost packet is only resent after a delay, the retransmission timeout (RTO). Its minimum seems to be 300ms on Windows, and it commonly seems to be 1.5-2x the round-trip time (RTT) of the connection. This can be a lot for players who live in disadvantaged locations relative to the PA server. The following "analysis" might be somewhat faulty in details and exact math due to my superficial understanding, but the gist of it should still be true.

Let's say the retransmission timeout causes an input to be resent after 1.5*RTT. With 1 lost packet it therefore takes 1.5*RTT until the message is resent and 0.5*RTT for the resent message to arrive at the server, so a total of 2*RTT from client to server. Then the packet has to be sent on to the recipient, meaning the transmission time for an input to arrive on the recipient's end is:

RTO + 0.5*RTT (sender) + 0.5*RTT (recipient) = 2*RTT (sender) + 0.5*RTT (recipient)

Every time a retransmission timeout is hit without receiving an ACK, the retransmission timeout is doubled. So in the very bad case of the resent packet also being lost, the client waits another 2*RTO = 3*RTT after resending before trying to resend again, for a total wait of 3*RTO:

3*RTO + 0.5*RTT (sender) + 0.5*RTT (recipient) = 5*RTT (sender) + 0.5*RTT (recipient)

With 2 packets lost, it should be fairly obvious that even on a perfectly fine connection, 3*RTO alone brings us to 900ms minimum (given a 300ms minimum RTO), which combined with the actual transmission definitely puts us above the rollback cutoff.
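The TCP arithmetic above can be written down as a small function. This is a sketch of the simplified model used in this comment (initial RTO of 1.5*RTT, doubling on every timeout), not a description of how any real TCP stack computes its RTO. The function name and the `rto_factor` and `min_rto` parameters are made up for illustration.

```python
def tcp_input_delay(rtt_sender, rtt_recipient, lost, rto_factor=1.5, min_rto=0.0):
    """Time in ms until an input reaches the recipient via the server when
    the sender's packet is lost `lost` times in a row (simplified model)."""
    initial_rto = max(rto_factor * rtt_sender, min_rto)
    # The RTO doubles after each timeout: RTO + 2*RTO + 4*RTO + ...
    waiting = initial_rto * (2 ** lost - 1)
    return waiting + 0.5 * rtt_sender + 0.5 * rtt_recipient


# 100ms RTT on both sides:
one_loss = tcp_input_delay(100, 100, lost=1)   # 2*RTT + 0.5*RTT = 250ms
two_losses = tcp_input_delay(100, 100, lost=2)  # 5*RTT + 0.5*RTT = 550ms
# 20ms RTT on both sides, but a 300ms minimum RTO:
floor_case = tcp_input_delay(20, 20, lost=2, min_rto=300)  # 900ms + 20ms = 920ms
```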

A UDP-based approach would be much more robust against packet loss, as inputs would be included in every message until an ACK from the server arrives for any of those messages. A lost packet would delay the transmission of an input by only 16.6ms for each consecutive lost frame. The formula for transmission time changes to 0.5*RTT (sender) + 0.5*RTT (recipient) + n * 16.6ms, where n is the number of consecutive packets being lost. TCP can be considered robust for connections up to 400ms ping, but packet loss can still completely wreck a connection. UDP, on the other hand, is robust for connections up to one second and is mostly unaffected by packet loss.
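As a quick worked example of the UDP formula, here is a sketch that assumes one packet is sent per frame at 60 frames per second (which is where the 16.6ms comes from). The function name is made up for illustration.

```python
FRAME_MS = 1000 / 60  # one frame at 60 fps, roughly 16.67ms


def udp_input_delay(rtt_sender, rtt_recipient, consecutive_lost):
    """Time in ms until an input reaches the recipient via the server when
    every packet repeats all unacknowledged inputs (simplified model)."""
    return 0.5 * rtt_sender + 0.5 * rtt_recipient + consecutive_lost * FRAME_MS


# 100ms RTT on both sides:
no_loss = udp_input_delay(100, 100, 0)     # 100ms
two_losses = udp_input_delay(100, 100, 2)  # roughly 133ms
```

In this model two consecutive lost packets cost about 33ms, because the next packet already carries the missing inputs and nothing has to wait for a timeout.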

All in all, UDP would make PA a lot more resistant to ping spikes and generally produce consistently much more favorable results for players living in disadvantaged locations relative to the PA server. I dare say that if we have UDP, any net-related problems should evaporate for anyone who isn't on the verge of losing their connection completely. After that, P2P should be unnecessary, considering that we send data to the server anyway to provide spectate features etc. I'm changing the issue title accordingly and removing the low priority tag, as this is likely one of the top reasons for connection-related room crashes.

AegisCrusader commented 1 year ago

However, the difference between TCP and UDP is that TCP is a guaranteed connection. When using UDP packets, the sender only attempts to send the packet once and does not care whether or not the packet gets received; the server does not send an ACK back. TCP guarantees delivery. TCP also includes error control, and systems have buffers for making sure that packets are delivered in the correct order. So if, for instance, a packet containing frame 3 reaches the server before the packet containing frame 2 does, TCP is able to correctly reassemble the fragments in order because packets are assigned sequence numbers. It will wait for 2 to come first before 3. TCP ensures that data is not damaged, lost, duplicated, or delivered out of order, which is invaluable to us since even one wrong input desyncs the entire game.

sharpobject commented 1 year ago

GGPO has been open-sourced so you guys could have a look at that if you like. https://github.com/pond3r/ggpo

Endaris commented 1 year ago

@Shosoul I'm aware of that. What the article suggests is that we build our own sequence number + ACK into our UDP communication so that the target system can always identify the correct order of inputs. With each packet, the sender sends redundant information for everything it hasn't yet received an ACK for from the recipient. That means there will never be a missing input that breaks the reliable ordering we need. The recipient sends an ACK back for the inputs it has already received so that the sender's messages don't bloat from sending redundant information all the time. See lua-enet, which I linked earlier; it is a UDP library that can also do reliable message ordering. Ultimately, packet loss can seriously screw up netplay for some people. I lived in such a place myself, where on a few days I could play online perfectly fine but on most days it was a hopeless mess with room crash after room crash.

@sharpobject This looks interesting, although honestly more for investigating how rollback is performed on the client and how the data is stored, because rollback in PA is still woefully resource intensive. I believe Jam tried to implement running rollback before and the performance was not quite satisfactory. I genuinely don't believe a running rollback mechanism is the right choice for a game where player interaction is by design indirect and delayed. PA is in the fortunate situation of having a 1s buffer before attacks start falling, so we can easily afford to let both stacks run out of sync without anything breaking as long as the desync doesn't increase to critical levels. That is where PA is fundamentally different from a fighting/action game that needs both players' inputs for a frame to figure out what happens. I'm going to read a bit through the docs they have and see if there's anything that would be applicable to PA.

At the end of the day this is still not our biggest issue; for most people the game runs just fine with TCP. My goal for a UDP implementation would be to improve net stability by reducing room crashes caused by packet loss. Going from a B+ to S rank, sort of. It's not a huge priority because the game isn't running poorly, but it would arguably be nice, and personally I definitely still see enough random room crashes to warrant removing the low priority label. I mostly added this to the issue because someone pointed me towards an article mentioning the impact of packet loss on TCP connections, and I didn't want that insight into why we might want UDP to be lost.

I think the first step would be to port our existing network to lua-enet and see how that works. lua-enet's default mode is to transmit messages reliably and in order and it has some features that luasocket lacks, e.g. requesting the round trip time from the client-side socket so we can display a ping in the game.

AegisCrusader commented 1 year ago

I guess I'm just not understanding how changing to UDP will fix the issue of packet loss, and I also don't understand why it would be associated with room crashes. I'm going to play some games while capturing with Wireshark and see what we get packet-wise, but I'm having a really hard time understanding why UDP would be a big solution. I'm wondering if there's something else going on in the code.

Endaris commented 1 year ago

We talked this out in VC, so I'll include a quick summary of what the content of UDP packets would look like. Presuming a client starts sending to the server, it will keep sending all of its recorded inputs, designating the frame number from which they start, until it gets an acknowledgement from the server, e.g.

Packet 1: {startframe: 0, inputs: A}
Packet 2: {startframe: 0, inputs: AA}
Packet 3: {startframe: 0, inputs: AAA}

The server receives any packet and compares the inputs inside with the list of inputs it has already gotten. Then it truncates the redundant inputs and sends an ack for how many frames it has recorded so far. Imagining that packet 2 arrives first, the server would record AA and send back something like {ack:2} to signal that it has recorded 2 frames. When packet 1 arrives later, the server notices it only contains redundant information and discards it (with no extra ack).

When the client receives the {ack:2} from the server it will no longer include the first 2 frames in its messages and adjust the startframe of its inputs, e.g. Packet 5: {startframe: 2, inputs: AAA}.
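The scheme described above can be sketched in a few lines. This is only an illustration of the idea, with no actual networking; the class and method names are made up, inputs are modelled as one character per frame, and distinct letters are used instead of repeated `A`s so the ordering is visible.

```python
class InputSender:
    """Client side: resends every input the server has not acknowledged yet."""

    def __init__(self):
        self.start_frame = 0  # first frame that is not yet acknowledged
        self.pending = ""     # inputs recorded from start_frame onwards

    def record(self, inp):
        self.pending += inp

    def make_packet(self):
        return {"startframe": self.start_frame, "inputs": self.pending}

    def on_ack(self, ack):
        # `ack` is the total number of frames the server has recorded.
        # Stale acks (arriving late or out of order) are ignored.
        if ack > self.start_frame:
            self.pending = self.pending[ack - self.start_frame:]
            self.start_frame = ack


class InputReceiver:
    """Server side: keeps the ordered inputs and drops redundant ones."""

    def __init__(self):
        self.inputs = ""

    def on_packet(self, packet):
        have = len(self.inputs)
        start = packet["startframe"]
        if start > have:
            return None  # would leave a gap; dropped defensively
        new = packet["inputs"][have - start:]
        if not new:
            return None  # only redundant information, no extra ack
        self.inputs += new
        return {"ack": len(self.inputs)}


sender, receiver = InputSender(), InputReceiver()
packets = []
for inp in "ABC":
    sender.record(inp)
    packets.append(sender.make_packet())

ack = receiver.on_packet(packets[1])         # packet 2 arrives first: {"ack": 2}
late = receiver.on_packet(packets[0])        # packet 1 is redundant: None
sender.on_ack(ack["ack"])                    # packet 3 is lost entirely
sender.record("D")
sender.record("E")
packet5 = sender.make_packet()               # {"startframe": 2, "inputs": "CDE"}
final_ack = receiver.on_packet(packet5)      # {"ack": 5}
```

Note that the lost packet 3 never has to be retransmitted: the input it carried arrives with packet 5.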

By sending redundant information until receiving acknowledgement from the server, there is never a need to retransmit lost packets, and preservation of the input sequence is guaranteed. This would be a very basic mechanism to get UDP to work for PA, and it would likely reduce the occurrence of rollback scenarios, as the TCP retransmission delay is somewhat likely to cause desync spikes big enough for the rollback logic to kick in.

It should be noted that there are some other caveats with UDP: it's not a connection-based protocol, so we can't just go ahead and implement it on luasocket's UDP library. If lua-enet supported IPv6 this would be a somewhat simple thing because it seems to take care of these caveats, but it doesn't, so it probably isn't. Personally I have no clue what problems exactly this may cause. Going to ask around.