touhoumj / gensou

Implementation of the Touhou Unreal Mahjong 4N server
GNU Affero General Public License v3.0

Collab Proposal !! #2

Open NicolasTurpin opened 2 months ago

NicolasTurpin commented 2 months ago

Hey Hello !

I'm the guy who translated and modded THMJ-4N. A few days ago I discovered some streamer playing the game again and it made me feel nostalgic. You can imagine how sad I was to see that the online mode wouldn't work anymore, so I started investigating a way to allow people to play LAN games... And then I discovered what you did and I'm super impressed !!!

If I understand it well : One player will have a python socket server hosted ; while the other players should modify the Socket.lua script to connect to the host, right ? So you're not connecting to an online router or anything ?

Well... I know it's a bit late now that the new THMJ will release on Steam, but 4N is my baby and I want to keep it alive !

So : Are you in for a big merge of what we both did ? 1) The client patch you did would be integrated into my patch, but you could set the host address and port in the options instead of manually modifying the lua files. 2) I could implement locally saved stats / Quick Matches (even if it's not massively useful for LAN) 3) add some security proofing with deserialization, etc... 4) Hell, we might even allow for more than 4 people to join a server and play simultaneous games in case you want to do tournaments or so !

In any case I wanted to patch some new stuff so that may be the occasion ^^

Don't worry I'll not do it without your approval and without crediting you !

That said, have a nice day and see you later ;D

chinponya commented 2 months ago

Hey. Thanks for reaching out! I'll address your questions first.

If I understand it well : One player will have a python socket server hosted ; while the other players should modify the Socket.lua script to connect to the host, right ? So you're not connecting to an online router or anything ?

Yes. The python server is meant to be hosted on a publicly available machine (with a public address), but it could work just as well on a LAN, provided the other clients know the address, as there's no server discovery. The main thing which would prevent you from running servers on the client side is NAT, which requires users to have the ability to port forward.

  1. The client patch you did would be integrated into my patch, but you could set the host address and port in the options instead of manually modifying the lua files.

There's a game bundle which includes this patch, configured to an existing server, on the /mjg/ repo. To keep this repo legally in the clear I haven't linked to it anywhere. It doesn't offer a way to configure a different server easily, so there's certainly room for improvement.

  2. I could implement locally saved stats / Quick Matches (even if it's not massively useful for LAN)

This would be an excellent feature. I haven't bothered doing it personally since I found the process of modifying the game and testing changes to be incredibly tedious.

  3. add some security proofing with deserialization, etc...

I think the Lua eval on packet data needs to be completely replaced with another serialization format for this to be sound.
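A minimal sketch of why eval-based decoding is unsound, written in Python as an analogy rather than with the game's actual Lua code. The function names and the packet contents are hypothetical; the point is only that an evaluator runs whatever the sender wrote, while a data-only parser rejects it.

```python
import json

def decode_with_eval(packet: str, env: dict):
    # Unsafe: the packet text is executed as code in the receiver's process.
    return eval(packet, env)

def decode_as_data(packet: str):
    # Safe: the parser only understands data literals and raises on anything else.
    return json.loads(packet)

# A hostile "packet" that performs a side effect and then returns an empty table.
side_effects = []
malicious = "side_effects.append('pwned') or {}"

decode_with_eval(malicious, {"side_effects": side_effects})  # runs the payload

try:
    decode_as_data(malicious)
    rejected = False
except json.JSONDecodeError:
    rejected = True  # the data-only decoder refuses the payload
```

JSON is used here only because it ships with Python; any format with a parser that never executes its input gives the same guarantee.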

  4. Hell, we might even allow for more than 4 people to join a server and play simultaneous games in case you want to do tournaments or so !

The server actually replicates the original lobby feature, so running multiple games in parallel just works.

Now, onto my side of things. You should see this server as no more than a proof of concept. There are many fundamental issues with the server as well as how the game networking works. I should warn you at the start that this will entail more work than it seems. Personally I didn't dare to dive all in, especially while knowing that there would be a replacement game coming out soon.

The first, fixable server issue: the state management. Since everything happens in one python thread, which holds this state in a global variable, any crashes will break the game for everyone currently playing. This also may restrict the server capacity. Adding in a database (which would also make hosting on users' side significantly more complex) would be the right fix. I'd be willing to do that, since now I have a person with a legitimate interest 😄 . Server side stats would also become possible then, but I do think that storing them locally is strictly superior anyway (albeit likely more difficult).
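A rough sketch of what moving the state out of a process-global variable could look like, assuming SQLite from the Python standard library and a made-up `rooms` table with a JSON blob per room. None of these names come from the actual server; a real schema would need more thought.

```python
import json
import os
import sqlite3
import tempfile

def open_store(path):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS rooms ("
        "room_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    db.commit()
    return db

def save_room(db, room_id, state):
    # "with db" commits on success and rolls back on error.
    with db:
        db.execute(
            "INSERT OR REPLACE INTO rooms (room_id, state) VALUES (?, ?)",
            (room_id, json.dumps(state)),
        )

def load_room(db, room_id):
    row = db.execute(
        "SELECT state FROM rooms WHERE room_id = ?", (room_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lobby.db")
    db = open_store(path)
    save_room(db, "room-1", {"players": ["A", "B"], "round": 3})
    db.close()             # simulate the worker crashing and being restarted
    db = open_store(path)  # a fresh process would start from the file
    restored = load_room(db, "room-1")
    db.close()
```

Because the state lives in a file rather than in the worker's memory, a restarted worker can pick up the rooms that were in progress.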

What really broke me were stability issues the cause of which I was unable to track down. The game crashes frequently (at least under wine/proton) and due to how the networking is implemented, the player will be replaced with AI without the ability to come back into the game. On average this ruins every other game I played. I'm pretty sure this has nothing to do with my patches or the server though, but I'm not certain. If it is something I introduced, it must be due to the keepalive property of http connections, since the built-in http client seems broken in various ways.

We might not be able to deal with the game crashes, so implementing a rejoin feature could be worthwhile.

There are also stability issues related to networking. The game sends way too many polling requests and it behaves poorly under high latency conditions, often to the point of not being able to play it at all. Any packet loss seems to cause players to drop out.

This is the hardest problem, because I believe the only solution is to rewrite the client networking. Ideally we'd use a persistent (web)socket, so that the clients won't have to poll the server 50 times per second. It would also allow us to use a more suitable language/runtime for implementing the server. The initial choice fell on python due to the fact that it had an http server the game was happy with - more modern implementations seem to outright reject the http requests sent by the client and I was never quite sure why.
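To illustrate the difference from polling, here is a small sketch of a server pushing events over one persistent connection. It assumes plain TCP through Python's `asyncio` streams rather than a real WebSocket library, and the newline-delimited framing and the event names are invented for the example.

```python
import asyncio

async def handle_client(reader, writer):
    # Hypothetical game events, pushed as soon as they happen.
    for event in (b"deal", b"discard", b"tsumo"):
        writer.write(event + b"\n")
        await writer.drain()
    writer.close()
    await writer.wait_closed()

async def demo():
    # Port 0 lets the OS pick a free port on localhost.
    server = await asyncio.start_server(handle_client, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        received = []
        # One connection, no polling: the client just waits for the next line.
        while line := await reader.readline():
            received.append(line.rstrip(b"\n").decode())
        writer.close()
        await writer.wait_closed()
    return received

events = asyncio.run(demo())
```

The client makes a single connection and receives every event in order, instead of issuing a new HTTP request each time it wants to know whether anything changed.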

If you think that swapping out the client networking is on the table, I'd be down to rewrite the server. I think it will always be terrible unless this is done.

I should point out however that I've never experienced what playing on the official server is like - it was already broken when I first discovered it. Maybe I'm just making shit up and there are ways to make it work well without a networking rewrite, in which case what I really lack is the ability to debug the game client. I've been logging every packet from every client and I could never figure out what was wrong.

chinponya commented 2 months ago

tl;dr we can do what you suggest, but it will suck, since the client and server suck. To make it not suck there's a lot more to be done, but it might be doable.

NicolasTurpin commented 2 months ago

Thank you for your reply :D

I'll try to answer as best as I can without omitting anything ^^

So yeah the /mjg/ repo led me here, I never included any exe or illegal stuff in my repo as well but I think nobody would mind anymore : it was such a pain in the butt to get enough CDs for 4 players hahaha xD but yeah let's keep it legal as much as possible.

"""The main thing which would prevent you from running servers on the client side is NAT""" Okay so I'm not the biggest expert on network stuff (at all), but wouldn't NAT require a router running h24 ? If that's not the case that could be great news ! But that would be very surprising haha

"""I think the Lua eval on packet data needs to be completely replaced with another serialization format for this to be sound.""" 100% agree with you. And for what it is sending it doesn't sound very difficult to achieve ! I always hated this string eval stuff.

"""The server actually replicates the original lobby feature, so running multiple games in parallel just works.""" That's actually soooo great !!! I would actually be willing to setup a community server for this, as long as it can be defended !!

"""especially while knowing that there would be a replacement game coming out soon.""" Yeah at this point it is more of a museum preservation matter than actually reviving the game x) But this game is so important to me, for a really bizarre reason, I want to fight for it xD

"""Since everything happens in one python thread, which holds this state in a global variable, any crashes will break the game for everyone currently playing""" There's actually veeeeeery easy ways to get through this (at least this time it is in my range of competences haha) I can handle that easy ;3

"""I'd be willing to do that, since now I have a person with a legitimate interest""" Is this person.. Me ? Or are there actually other people interested in this ? hahahaha Moving the saved data to client side will be great, I'll start looking into it and it should surely be achievable ! (And for the life of me I couldn't understand why this wasn't the case from the very start anyway. Even solo general stats were saved online. That's just dumb)

"""The game crashes frequently""" This is quite curious, I've played quite a few online games with friends and strangers and I never experienced a crash. Did you do the manual update beforehand ? Was there a consistent setup (table/character) that causes the crash ? There actually was a table (the aotenjou one where you could only win points) that would bug the stats and sometimes crash, but I fixed it. If you still have the replay data from these games I can look into it !!

I've actually looked up how to implement a re-join feature and I must admit it seems to be the most complicated task to do. The games are not saved at each state but instead save each input (which doesn't allow replays to go back in time), so rejoining would imply loading a crashed replay and simulating the whole game before giving the hand back to the player (I don't know if I'm clear haha ^^")

"""It would also allow us to use a more suitable language/runtime for implementing the server.""" """If you think that swapping out the client networking is on the table, I'd be down to rewrite the server.""" Naaaaah we can keep python as the server language hahaha Jokes aside, maybe having a Lua server would make more sense ? maybe not ? I actually don't know... at all. From what I saw, the basic replay saving and networking workflow were basically the same in terms of tasks/requests but I'm just a python pipeline dev and everything network related is a bit tough to me ^^" But, I mean, from what I saw it wasn't super complicated at all. We can surely improve it !!

"""I should point out however that I've never experienced what playing on the official server is like""" Rrrrraaaah this makes me think we could actually rely on the current network client because it worked really well on the official server. I'll have a look at your server script and do some tests in a week or two when I come back home !!

Sorry for my poor writing, I haven't slept but I wanted to answer you tonight as I'm leaving for 10 days without internet. I'll be back soon and ready to tackle some challenges ;D

Have a nice day and see you around !!

NicolasTurpin commented 2 months ago

tl;dr we can unsuckize it ! It only depends on whether we want to invest time in a lovely yet dead/outdated game x)

chinponya commented 2 months ago

wouldn't NAT require a router running h24 ?

With how it works currently, regardless of whether the clients are behind a NAT or not, the server needs to run 24/7 for players to be able to play online. The same was true for the official server. The alternative is to rewrite the networking to be p2p, which is not only very difficult and hard to get right, but will still require a 24/7 server for NAT-punching, which allows 2 clients behind a NAT to connect to each other.

Hosting it isn't a big issue and could be done for free. I'm already hosting the current implementation. We should just provide a way to configure the server to use without having to re-bundle the thmj4n.p file, but that's trivial.

I think the Lua eval on packet data needs to be completely replaced with another serialization format for this to be sound.

I've looked a bit into it and I believe cbor would be a good fit. It's compact, has a great lua library and can encode/decode all lua terms just fine, unlike something like json.
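To make the comparison with JSON concrete, here is a hand-rolled sketch of a small CBOR subset (integers, booleans, byte strings, text, arrays, maps) following RFC 8949. It is written in Python with only the standard library and is not the Lua library mentioned above; floats, tags and indefinite-length items are deliberately left out, and the packet fields are invented.

```python
import json
import struct

def _head(major: int, n: int) -> bytes:
    """Encode the initial byte plus the length/value argument."""
    if n < 24:
        return bytes([major << 5 | n])
    for info, fmt, limit in ((24, ">B", 1 << 8), (25, ">H", 1 << 16),
                             (26, ">I", 1 << 32), (27, ">Q", 1 << 64)):
        if n < limit:
            return bytes([major << 5 | info]) + struct.pack(fmt, n)
    raise ValueError("integer too large for this sketch")

def encode(value) -> bytes:
    if isinstance(value, bool):  # must be checked before int
        return b"\xf5" if value else b"\xf4"
    if isinstance(value, int):
        return _head(0, value) if value >= 0 else _head(1, -1 - value)
    if isinstance(value, bytes):
        return _head(2, len(value)) + value
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(value, list):
        return _head(4, len(value)) + b"".join(encode(v) for v in value)
    if isinstance(value, dict):
        body = b"".join(encode(k) + encode(v) for k, v in value.items())
        return _head(5, len(value)) + body
    raise TypeError(f"unsupported type: {type(value).__name__}")

def decode(data: bytes, pos: int = 0):
    """Return (value, position of the next item)."""
    major, info = data[pos] >> 5, data[pos] & 0x1F
    pos += 1
    if major == 7:
        if info in (20, 21):
            return info == 21, pos
        raise ValueError("only booleans are handled among simple values")
    if info < 24:
        n = info
    elif info <= 27:
        size = 1 << (info - 24)
        n = int.from_bytes(data[pos:pos + size], "big")
        pos += size
    else:
        raise ValueError("indefinite lengths are not handled")
    if major == 0:
        return n, pos
    if major == 1:
        return -1 - n, pos
    if major in (2, 3):
        raw = data[pos:pos + n]
        return (raw if major == 2 else raw.decode("utf-8")), pos + n
    if major == 4:
        items = []
        for _ in range(n):
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos
    if major == 5:
        table = {}
        for _ in range(n):
            key, pos = decode(data, pos)
            table[key], pos = decode(data, pos)
        return table, pos
    raise ValueError("tags are not handled")

# Integer keys and raw bytes survive CBOR; JSON turns the key 1 into "1".
packet = {1: "east", 2: [13, -1], "raw": b"\x00\xff", "riichi": True}
roundtrip, _ = decode(encode(packet))
json_roundtrip = json.loads(json.dumps({1: "east"}))
```

Here `roundtrip` equals `packet`, while `json_roundtrip` comes back as `{"1": "east"}`, which is the kind of mismatch that matters for Lua tables with numeric keys.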

There are other places where replacing the format would be nice. Any GET requests the game makes expect a CSV response, which is quite silly. Swapping this out with cbor would be nice.

This is quite curious, I've played quite some online games with friends and strangers and I never experienced a crash. Did you do the manual update beforehand ? Was there a consistent setup (table/character) that causes the crash ?

Yep, the game was updated. I haven't found a consistent setup - it seems completely random.

I believe the crashes have something to do with running the game under wine/proton, which quite a few people who bothered to try this game use.

the games are not saved at each state but instead save each input (which doesn't allow replays to go back in time), so rejoining would imply loading a crashed replay and simulating the whole game before giving the hand back to the player

That's about what I imagined implementing reconnects would be like. It should be good enough if there's a way to simulate the game state up to the most recent event, without slowly animating everything.

The crashes shouldn't be caused by any game state, considering that it only happens to one player at a time, instead of all of them, which should be in the same state. If that's the case, restoring the game shouldn't cause a crash to happen again.

This mechanism would also be needed for recovering from network failures, either by simulating a batch of missed events or all of them.
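A small sketch of that replay idea, assuming the input log is an ordered list of `(kind, seat, tile)` tuples. The event shapes, the `GameState` fields and the function names are all hypothetical and much simpler than what the game records; the same function covers both a full rebuild and a batch of missed events.

```python
from dataclasses import dataclass, field

@dataclass
class GameState:
    hands: dict = field(default_factory=dict)
    discards: dict = field(default_factory=dict)
    applied: int = 0  # number of log entries already applied

def apply_event(state, event):
    kind, seat, tile = event
    if kind == "draw":
        state.hands.setdefault(seat, []).append(tile)
    elif kind == "discard":
        state.hands[seat].remove(tile)
        state.discards.setdefault(seat, []).append(tile)
    else:
        raise ValueError(f"unknown event kind: {kind}")
    state.applied += 1

def catch_up(state, log):
    """Apply only the events this state has not seen yet, with no animation."""
    for event in log[state.applied:]:
        apply_event(state, event)
    return state

log = [("draw", 0, "1m"), ("draw", 0, "9p"),
       ("discard", 0, "1m"), ("draw", 1, "5s")]

# Rejoin after a crash: rebuild everything from an empty state.
rebuilt = catch_up(GameState(), log)

# Recover from a network hiccup: only the missed tail of the log is applied.
resumed = catch_up(GameState(), log[:2])
resumed = catch_up(resumed, log)
```

Both paths end in the same state, which is the property a reconnect feature would rely on.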

If you still have the replay data from these games I can look into it !!

I did not consider checking the replay file, since I had assumed it wouldn't be saved after a crash. I'll keep this in mind when I'm able to reproduce a crash.

this makes me think we could actually rely on the current network client because it worked really well on the official server.

I have two theories for why the networking is unreliable:

  1. The route between the client and the server is bad for some players. To address this, the client should be able to deal with socket disconnects and timeouts during a game, which I don't think it does. More testing would be necessary. I haven't experienced this failure mode personally, so it's hard to say.
  2. High latency clients are locking up the server thread, since there's just one, causing other requests to time out. It's possible that making the server multi-threaded would magically solve that, but I'm still leaning towards replacing the client's networking - it feels very poorly designed and it sends a lot of requests (around 250 per minute), likely making any network issues that occur spiral out of control.

"""Since everything happens in one python thread, which holds this state in a global variable, any crashes will break the game for everyone currently playing""" There's actually veeeeeery easy ways to get through this (at least this time it is in my range of competences haha) I can handle that easy

I should clarify that I haven't actually observed any crashes in my server code. The ones that did happen were due to some evil bots that scan the entire internet, trying to exploit vulnerabilities, causing the server worker to time out, which gunicorn would then restart.

What kind of solution do you have in mind? From what I've seen, introducing multi-threading with shared state in python is quite annoying. Using an external database on the other hand would essentially require rewriting most of the server anyway.
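For reference, the shared-state variant usually comes down to something like the sketch below: every read and write of the lobby goes through one lock. The `Lobby` class and its methods are hypothetical and not taken from the server code.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class Lobby:
    def __init__(self):
        self._lock = threading.Lock()
        self._rooms = {}

    def join(self, room_id, player):
        # The check and the append must happen under the same lock,
        # otherwise two threads could both take the fourth seat.
        with self._lock:
            players = self._rooms.setdefault(room_id, [])
            if len(players) >= 4 or player in players:
                return False
            players.append(player)
            return True

    def snapshot(self):
        with self._lock:
            return {room: list(players) for room, players in self._rooms.items()}

lobby = Lobby()
with ThreadPoolExecutor(max_workers=8) as pool:
    # Twenty players race for the four seats of one room.
    results = list(pool.map(lambda i: lobby.join("room-1", f"player-{i}"), range(20)))

seated = lobby.snapshot()["room-1"]
```

Exactly four of the twenty join attempts succeed. The annoying part is that every piece of code touching the state has to remember to take the lock, which is where this approach tends to get error-prone.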

chinponya commented 2 months ago

A lot has changed over this past week.

The client portion of the repository has been moved to a separate one: https://github.com/touhoumj/gensou-client/

You may have also noticed that the repository has been moved from privately owned to an organization. This should help with discovery of anything related to the project.

There's actually another private repo in the org with the game sources, from which the patches in the client repo are generated. I can give you access to it if there's something you'd like to work on. This way it will be easier to collaborate.

We actually have a tool in the client repo which can load modified portions of the thmj4n.p file from a directory, so that we don't have to re-pack it every time we want to test a change. It's not reliable enough to use as the main means of distributing game modifications, but for development it has been invaluable and it would have been impossible to get as far as I did without it.

And finally, the server has been rewritten. It is now at the point of feature parity with the previous implementation. There's still a lot left to do (most importantly, reconnects), but it should serve as a better foundation for any future work.