toddw123 / RotMG_Clientless

Compatible with version X16.0.0
MIT License

Random Disconnects #48

Closed toddw123 closed 6 years ago

toddw123 commented 7 years ago

Alright guys, I'm sure you have all experienced it. I'm looking for some feedback on what you think the problem could be.

Some things I have tried (that didn't help):

Now, I'm fairly sure the bots still randomly DC because of the movement, but I don't know which part of it. For instance, if you log in 22 bots and have them just sit in the nexus without moving, they shouldn't DC (I haven't really tested this in a while, though). But I can't find anything wrong in the movement code or packet.

I have also added a logger to my copy that records the last 2 packets sent and received. It might not be the Move packet that causes the disconnect, because I get a bunch of different packets as the last 2 in/out. Sometimes the last ones are NewTick/Move, other times they are Update/UpdateAck or Ping/Pong. All 3 of those pairs seem to pop up as the last packets in/out fairly evenly.
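A logger like the one described can be sketched as a pair of fixed-size buffers. This is only an illustration in Python, not the code from the bot: the class name `PacketHistory` and its methods are made up for the example, and packets are represented by their names only.

```python
from collections import deque


class PacketHistory:
    """Keeps only the most recent packet names seen in each direction."""

    def __init__(self, depth=2):
        # deque(maxlen=...) drops the oldest entry once the buffer is full.
        self.sent = deque(maxlen=depth)
        self.received = deque(maxlen=depth)

    def record_sent(self, name):
        self.sent.append(name)

    def record_received(self, name):
        self.received.append(name)

    def summary(self):
        """One line that could be printed from a disconnect handler."""
        return "in: {} | out: {}".format(
            ", ".join(self.received) or "-",
            ", ".join(self.sent) or "-",
        )


history = PacketHistory(depth=2)
for name in ("Update", "NewTick", "Ping"):
    history.record_received(name)
for name in ("UpdateAck", "Move", "Pong"):
    history.record_sent(name)

print(history.summary())  # in: NewTick, Ping | out: Move, Pong
```

Raising `depth` to something like 10 or 20 costs almost nothing and may show a pattern that the last 2 packets alone cannot.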

If anyone has ideas on what it might be, or something I should try in order to solve it, let me know. I would love to get this fixed so they no longer randomly DC. And I will say that it is completely random. I have watched all 22 bots go for a solid 30 minutes before just 1 of them DCs, and after that I will see random ones drop off in no particular order.

My last guess, and I've been trying to follow a few of them to see if this is the case, is that the bots might be walking over something they shouldn't. For example, if a bot tries to walk through a wall it will get disconnected. So maybe the bots are walking over a boulder or something else that they shouldn't be, and it's causing the disconnect. It would explain why it's random, as their paths (for me at least) are entirely random. I thought the path-finding code was good though: I made it so it won't try to walk on a tile that has any object with NoWalk or OccupySquare or a few other things. But I might have missed something, and they could occasionally be walking over something they shouldn't. I don't know yet. Maybe I'll add some output to the failure handler that prints their last x/y location, so I can check where they were when it happened.
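The walkability rule and the "print the last x/y on failure" idea could look roughly like the Python sketch below. Everything here is hypothetical: the function names, the `tile_flags` dictionary and the sample coordinates are invented for illustration, and only the two flags named in the post are checked (the real list has "a few other things" that are not shown in this thread).

```python
import math

# Flags named in the post; the bot's real list is longer.
BLOCKING_FLAGS = frozenset({"NoWalk", "OccupySquare"})


def is_walkable(flags_on_tile):
    """True when nothing on the tile carries a blocking flag."""
    return BLOCKING_FLAGS.isdisjoint(flags_on_tile)


def describe_disconnect(x, y, tile_flags):
    """Map a world position to its tile and report whether it was walkable.

    tile_flags is a hypothetical dict: (tile_x, tile_y) -> set of flag names.
    """
    tile = (math.floor(x), math.floor(y))
    flags = tile_flags.get(tile, set())
    status = "walkable" if is_walkable(flags) else "BLOCKED"
    return "last position ({:.1f}, {:.1f}) -> tile {} is {}".format(x, y, tile, status)


# Made-up map with a single blocking object on tile (10, 12).
tile_flags = {(10, 12): {"OccupySquare"}}
print(describe_disconnect(10.5, 12.5, tile_flags))
print(describe_disconnect(30.2, 40.7, tile_flags))
```

Logging the tile status next to the raw coordinates makes it easier to separate the disconnects that happen on a blocked tile from the ones that happen in open space.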

toddw123 commented 7 years ago

I added the output of the last x/y to my bot code, and it looks like SOME of them do disconnect from walking somewhere they shouldn't. One of the bots, for example, kept ending up at 171.5,149.5 or something like that, and that exact location is a wall. The reason for this might be that the path-finder allows diagonal moves. I might try removing diagonal moves and see if that stops some of the disconnects.
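To show why diagonal moves can put a bot on or through a wall, here is a small Python sketch of a neighbor function for a grid path-finder. It is not the bot's path-finder; the function name, the `blocked` set and the coordinates are assumptions for the example. With `allow_diagonal=False` it matches the "remove diagonal moves" idea. With diagonals enabled it applies a common middle ground that is not from this thread: a diagonal step is only allowed when both orthogonal tiles next to it are free, so the path never cuts across the corner of a blocked tile.

```python
def neighbors(x, y, blocked, allow_diagonal=True):
    """Yield the tiles reachable from (x, y) in one step.

    blocked is a set of (tile_x, tile_y) pairs that cannot be entered.
    """
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if (x + dx, y + dy) not in blocked:
            yield (x + dx, y + dy)
    if not allow_diagonal:
        return
    for dx, dy in ((1, 1), (1, -1), (-1, 1), (-1, -1)):
        target = (x + dx, y + dy)
        if target in blocked:
            continue
        if (x + dx, y) in blocked or (x, y + dy) in blocked:
            continue  # would clip the corner of a blocked tile
        yield target


blocked = {(1, 0)}  # hypothetical wall tile directly east of the bot
print(sorted(neighbors(0, 0, blocked)))
# [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1)]
```

In the example, the moves to (1, 1) and (1, -1) are rejected even though those tiles are free, because a straight line to either of them passes the corner of the wall at (1, 0).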

But not all of the disconnects appear to come from walking somewhere they can't. Some of the bots seem to have disconnected in the middle of a grass field, with no objects or anything else around that they can't walk over. So I don't know yet; something strange is still happening.

Zeroeh commented 7 years ago

In your packet logger does the failure packet ever get logged? Usually the failure packet is sent to dc you when you use hacks and such, so I'm kinda curious.

toddw123 commented 7 years ago

You will get a failure packet for a large large LARGE number of reasons, and none of them are related to hacks as far as I can tell.

You will always get a failure packet from the server when getting kicked UNLESS the server is offline OR you fail to respond to the server for something like 10 seconds.

But the failure packet is extremely generic. It basically uses predetermined messages. For example, when the server restarts you will get a failure packet with the message "Server Restarting" (clearly not hack related). But other than that message, 99% of the time it will just say "connection to server lost" or something like that.

Zeroeh commented 7 years ago

Normally you get "lost connection to server" when doing naughty things. I think it might be related to the issue with bombs, like medusa bombs and such. If you put a bot on the beach and drag over one of those pirate leaders that throws a bomb, the bots will all take damage, regardless of whether they were in the blast radius, and DC.

toddw123 commented 7 years ago

"Naughty things" isn't always the cause. Of course you can DC from something hack related, but not all failure packets are from "naughty things".

And no, at least for my bots, the issue is not related to anything with bombs or projectiles. My bots do not enter any of the realms and do not track projectiles. In the nexus, there is never a need to send any kind of projectile acknowledgment packet.

And to go over again how extremely generic the failure packet's "lost connection to server" message is, I'll give you a few examples of things that can cause it:

There are many, many more reasons of course. These can fall under your "naughty things" if any of them are done on purpose (i.e. hacking), but these things can happen by accident as well. And again, there are a bunch of different failure packet messages you can get. You will ALWAYS get a failure packet when the server DCs you; it's not ONLY when "hacking/doing naughty things". So using a catch-all and saying you only get a failure packet when "hacking/doing naughty things" is a bit of a stretch.

A good example, one that nearly everyone has experienced at least once in their playtime, is disconnecting when going to Oryx's castle. Your screen starts to shake and the next thing you know, you get DC'd. Guess what? You got a failure packet. For what? I don't know, but you did, and it's the reason you got disconnected. It doesn't have to mean you were hacking or anything; the server just fucked up and kicked everyone. Etc.

So anyway, I would really like this post to be about what I created it for, which is to bring forward possible causes of and solutions to the random disconnects. I don't want it to go on and on about how failure packets may or may not be related to "naughty things", or to throw out possible disconnect scenarios that are unrelated to the actual code and actions of these bots. So while I appreciate your comments, I would prefer to focus on the actual code at hand and figure out where the problem in it is.

VoOoLoX commented 7 years ago

Here is an idea: try connecting it to a private server. That way you have full control over what's happening, and it should be much easier to debug this problem.

Zeroeh commented 7 years ago

@VoOoLoX Only problem is that private servers aren't an exact replica of the actual game. Even if it looks similar there are some backend things we don't see.

toddw123 commented 7 years ago

Yeah, using a private server to try to debug it probably isn't the best idea. For one, a private server might not be up to date with the current packet structures in the bot. And like Zeroeh said, private servers aren't going to work exactly the same as a production server. So while the production server could disconnect you for moving just a fraction too fast, a private server might not disconnect you unless you are going insanely too fast.

I still haven't figured out exactly what is causing the random disconnects. Like I said a few posts up, I did notice that a few of the disconnects were due to trying to cross one spot that is a wall, which would obviously cause a disconnect, but that was only a few of the bots and wasn't consistent as a reason. Obviously I'd like them to avoid doing this, and I thought the path-finder was supposed to avoid it (and they appear to avoid it for the most part), but clearly they do occasionally try to go through an object they can't, lol. But like I said earlier, this is just a small fraction of the disconnects. The majority of them seem to happen in the middle of the large field at the bottom of the nexus, and when I go to the x/y where they disconnected, they are nowhere near any object or anything else that would prevent them from walking there. So I'm still not sure what is causing those.

Zeroeh commented 7 years ago

Found this article and thought of this issue. Interesting read. Perhaps it could have something to do with the movement. http://blogs.asterisk.org/2016/06/01/float-conversion-bad-released-13-9-1-regression-fix/

toddw123 commented 7 years ago

I'll look at the article when I get a chance, but from the title I can say that it was part of the issue before, and I don't believe it is an issue anymore. Originally the bots used floats for all location-based data, since the packets use floats. But for some reason the accuracy of the floats caused problems, and it was the reason I could never get the bots to move at 100% speed. When I switched all the location data to use doubles instead of floats, it fixed the problem, and now the bots can move at 100% speed no problem. The packets still use floats unfortunately, but there's nothing I can do about that.
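One way float precision can hurt movement is accumulated rounding: if the position is stored as a 32-bit float, every small step gets rounded, and the error adds up over thousands of ticks. The Python sketch below only illustrates that general effect and does not claim to reproduce what happened in the bot. The start position, step size and tick count are made-up values, and `to_float32` emulates a C `float` by round-tripping through `struct`.

```python
import struct


def to_float32(value):
    """Round a Python float (a double) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def walk(start, step, ticks, single_precision):
    """Advance one coordinate by `step` per tick and return the result."""
    position = start
    for _ in range(ticks):
        position += step
        if single_precision:
            position = to_float32(position)  # state is stored as a float
    return position


start, step, ticks = 100.0, 0.001, 10_000  # hypothetical values
exact = start + step * ticks

# Position kept as a float the whole time.
as_float = walk(start, to_float32(step), ticks, single_precision=True)
# Position kept as a double, converted to float only once, as for a packet.
as_double = to_float32(walk(start, step, ticks, single_precision=False))

print("error with float state :", abs(as_float - exact))
print("error with double state:", abs(as_double - exact))
```

Keeping the state in doubles and converting to float only when a packet is written limits the rounding to a single conversion per value sent, which is consistent with the fix described above even though the packets themselves still carry floats.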

So ultimately there are no other floats in the code that can be changed, if the issue is related to them. Not until the stupid packets get changed to use doubles instead, at least. But I don't believe floats are an issue anymore now that I have changed everything possible to doubles. They definitely were an issue a month or so ago though, lol.