blueboy / portal

This portal repo is for development purposes only
http://github.com/blueboy/portal
GNU General Public License v2.0
27 stars 24 forks source link

Druid, warlock (perhaps all spellcasters?) lock up #10

Closed kennumen closed 12 years ago

kennumen commented 12 years ago

What we hoped to be fixed by the turn-and-face fix is still a problem. I have seen my warlock and druid still lock up. They won't react at all, won't move, turn or attack. After a little while (30seconds?) they return to normal. Haven't pinpointed exact reproducible circumstances - certainly occurs [sometimes?] while fighting multiple mobs. Keep on fighting, take some time for looting every few mobs, the error should occur soon enough. Should work (or rather, fail) with just level 1 bots. Mine was a 5man party, the warrior and (healer) priest were just fine meleeing.

On a possibly related note, bots appear to be casting 'backwards', as in, to a target (which -at least- to the player appears to be) behind the bot. Is it possible your turn-and-face-target code isn't communicated to the player client(s), or is something more hinky going on?

blueboy commented 12 years ago

Hmm,

The 'turn-and-face' wasn't meant to be a cure, but a treatment for the symptoms. I was getting an unknown spell failure (134) and the handler should cancel the current incomplete spell and if the target still exists turn the bot. Ideally we need to find why this error occurs in the first place. I have also seen spell failure (67) that is even more ambiguous. It's either

SPELL_FAILED_NOT_READY or SPELL_FAILED_CUSTOM_ERROR_67 = 67, // "You already have the maximum amount of honor."

The only honor my bots have got recently is from the 'Midsummer Fire Festival", but this is a long shot.

I did recently removed a SetFacingTo() call from PlayerbotAI.cpp that was causing the bots to appear to double back when attacking a target, ( https://github.com/blueboy/portal/commit/df11e7913c1b3911c2203667a2e95a8f944a4b6c#L0L2303 )

I will make this my next priority..

kennumen commented 12 years ago

It should be safe to ignored error 67. With that description, the only reason I can think of it being called is when the bots have the maximum (Wrath: 75k) amount of honor and they just killed an honorable kill (which I guess is a spell effect to increase the honor? That would be odd).

Actually, scratch that. The mere fact that you mentioned it warrants looking into. This 'glitch' is turning stranger and stranger lol. Thanks for looking into it.

kennumen commented 12 years ago

Okay, some more clues after more testing.

They genuinely get stuck. They won't move, and they're stuck in a meleeing rotation. The reason they seem stuck rather than meleeing is because they're generally out of range, and they're also stuck in orientation so even if a mob is in melee range but behind them, they'll just stand there.

For some reason, even after combat they continue to be stuck in place for a while. Perhaps they're simply stuck in a loop, unaware combat has ended - perhaps they believe they're in combat for another reason, or perhaps it's an across-the-board glitch with movement code. They will appear to be out of combat (sheathed staff and all) but will not even with a corpse in range. Nothing to do with loot codes; after getting 'unstuck' they will first follow then proceed to loot the corpse.

blueboy commented 12 years ago

Hi,

I think I might know what is causing the apparent lockup during combat. I know, your going to blow this idea out of the water, by saying you don't use it --- Combat Delay. It seems to me that this feature only has worth in attacking a single target. If the bots delay for every target in a mob, you can quickly see that the will become useless. I'm currently seeing whether I can aggravate this issue by setting large delay for each bot. If this is the cause, I propose to measure the size of the mob (m_attackerInfo) and only allow a delay in combat, if it's a single target i.e. m_attackerInfo.size() < 1

Interested in your comments

kennumen commented 12 years ago

That sounds as good a place to start as any. In a way, they are being 'delayed'. It's a bit strange it would only affect spellcasters (a healer who only uses the staff is thus not affected), but at this point we're well past strange ;)

Okay, I think we'll be talking next to each other with the use of 'mob'. AFAIK in WoW terminology a mob is a single PvE target (it doesn't make sense but keep in mind not all 10+ million players use english as a first language). For clarity in this post I'll use target and targets to differentiate, rather than the obviously ambiguous 'mob'.

Combat Delay should not only be used when the group of targets consists of one target. That could cause logical problems anyway. I can see a check like that failing on a group of 3, but when that group is whittled down to 1 target have it kick in. And a few other problems. What I propose is we redefine 'combat' in a way unlike what we use for 'FirstCombatManeuver'. This function 'resets' each time it acquires a new target. What we need for combat delay is a 'just started combat' rather than 'switched target' check, so that it always runs on the first target whether the group of targets is 1, 5, or 1 + aggro (during combat) + aggro + aggro.

For what it's worth, I do use combat delay, it's just set to 0. Which in theory should mean it can't be the problem but in practice... Well, I think we both know better ;)

blueboy commented 12 years ago

Hmmm,

useful to know that your getting the issue with combat delay at zero. There is obviously more to this than I thought. As you know I'm a complete novice at WOW and the use of game play terminolgy is mystery to me. ;)

side note:

I have found a fix for autoequip not working properly. I'm not sure whether you noticed this but frequently bots would have items in their inventories that would be better equipped. There are a few logic errors in the item comparison code that was causing this. I'll continue testing what I have and if it checks out, I'll post it shortly.

Cheers

kennumen commented 12 years ago

Honestly the way (both) my parties are built, I can't play while this bug exists - save for testing - so I'm stuck at levels 6 and 4.

Combat delay is really only useful for difficult encounters. Specifically in instances (depending on gear and how many you'll aggro - but always on bosses). It definitely needs tweaking still - a healer with say 3 combat delay (which I believe is default) will effectively NOT heal during those 3 seconds. 2 crits on the tank from a boss and he's a goner in that time :) (... probably exaggeration but still - not ideal). Not really the place to discuss this but I would consider limiting healing delay to 1 maximum ever and use combat delay solely for damage spells - for anyone designated as CO Healer.

OT: I try to remember not everyone in MaNGOS knows the blizzWoW terminology. Sometimes I forget, sometimes I go overboard :)

sidenote: Can't really test autoequip, don't play enough (feels pointless when half your DPS is malfunctioning). I did notice some strange stuff when I was refactoring or something - like the Death Knight being designated as a hybrid class (as far as gear score is concerned I think it's identical to a DPS warrior). Pretty sure I didn't mess with it then - not atop my list of priorities yet :)

Thanks again for taking a look at this bug. I'm hoping it's gone by sunday when I have visitors :) (no pressure)

blueboy commented 12 years ago

I've been trying to find a solution to the current bot control issues. In brief the bots were becoming unresponsive in combat for an indefinite period (normally until combat finishes).

My first task was to determine if a recent change in the code was responsible for this and I traced the issue back to the changes made in handling DoFirstCombatManeuver() and DoNextCombatManeuver().

https://github.com/blueboy/portal/commit/9f5e330c79716058ceb401ada8b201e4dea43b1b

I appreciate the effort that has gone into these changes. Many commits have since been applied to 'new-ai' branch that depend upon this code, so reverting the changes made was not a option.

The original ClassAI functions used a simple boolean return, but the new CombatManeuverReturns are more complex.

RETURN_NO_ACTION_OK // No action taken during this combat maneuver, as intended (just wait, etc...) RETURN_NO_ACTION_UNKNOWN // No action taken during this combat maneuver, unknown reason RETURN_NO_ACTION_ERROR // No action taken due to error RETURN_NO_ACTION_INVALIDTARGET // No action taken - invalid target RETURN_FINISHED_FIRST_MOVES // First-combat-maneuver finished, continue onto next-combat-maneuver RETURN_CONTINUE // Continue first moves; normal return value for next-combat-maneuver

This was a massive undertaking for anyone to do in a single commit, so hats off to the kennumen. I have done my best to follow the logic and relate these new values to the old 'true' and 'false'. If I am correct all relate to a false condition, except for RETURN_CONTINUE. I found a small discrepancy in the Rogue AI that was preventing stealth attack from working properly, but the rest seem to check out O.K.

I was concerned that the code added to find newtargets might contribute to this apparent lockup. However after testing I found this not to be the case. I have also reorganised the code with DoNextCombatManeuver() in PlayerbotAI.cpp and this does appear to help. I have pushed the changes to 'new-ai' for further testing, so let me know if it helps ;)

kennumen commented 12 years ago

First off, it goes unsaid too often so let me make a point of it here - thanks for debugging this. It's somewhat shameful of me to push something broken and ask someone else to fix it. Logical errors (which it looks like this is) are the hardest by far to fix. In short, your efforts are much appreciated.

I've been looking at your pushes and two jump out at me: https://github.com/blueboy/portal/commit/184c58993f29333884ecba304bf6cb317221c553 https://github.com/blueboy/portal/commit/9f5e330c79716058ceb401ada8b201e4dea43b1b

In the first push it looks like you disabled some of the CombatManeuverReturns functionality. The second push looks... It kinda looks like what I pushed to enable CombatManeuverReturns in the first place. Needless to say I'm rather confused. I won't be able to test until friday though, some testing should prove enlightening (I'll see if I can't slot it in sooner).

I was hoping the comments for these return values proved self-explanatory. As far as they relate to FirstCombatManeuver, "RETURN_CONTINUE" would be the only case to continue calling FirstCombatManeuver next rather than NextCombatManeuver.

It should be a TODO of mine to remove "NO_ACTION_UNKNOWN" everywhere. That's got to be the worst coding I've ever put to use :)

blueboy commented 12 years ago

I'm a little confused myself.

I've been looking at your pushes and two jump out at me: https://github.com/blueboy/portal/commit/184c58993f29333884ecba304bf6cb317221c553 https://github.com/blueboy/portal/commit/9f5e330c79716058ceb401ada8b201e4dea43b1b

The first push you cite is indeed mine, but the second is yours. It's not surpising that you say

The second push looks... It kinda looks like what I pushed to enable CombatManeuverReturns in the first place.

With the first push I'm not sure how I've disabled the functionality of the CombatManeuverReturns.

DoFirstCombatManeuver() changes:

I grouped all the CombatManeuverReturns together where (m_targetChanged = false) & kept RETURN_CONTINUE separate (m_targetChanged = true). The logic should be the same as the original.

DoNextCombatManeuver() changes:

It is obvious that you haven't implemented any of the return conditions here. I didn't change anything, I just grouped all the CombatManeuverReturns together, all using the default return condition.

The only real change made in the first push was that for the Rogue AI, so stealth attack would work correctly

kennumen commented 12 years ago

Test with warlock previously faulty, works perfectly. Looting is near-instant as well - not entirely realistic but very convenient - and more fun, I'd dare say.

I'd like to keep this issue open a smidgen longer; I ran into an issue that causes my ".bot add druid" to crash the server (if not immediately, then once it teleports near me). Didn't see any changes to the druid files so I suspect this is just a local problem, but would like to test properly tomorrow to confirm.

blueboy commented 12 years ago

Great to hear. I have posted a tool on the playerbot forum that maybe useful in debugging lockup/bottleneck in the future.

http://playerbot.mine.nu/index.php?/topic/63-debug-tools/page__pid__309#entry309

I found that my server crashed when I summoned a druid bot, but I was focused on resolving this lockup issue. I'm sure we can fix this too...

@EDIT on the druid crash issue. This is caused by

if (HealTarget(GetHealTarget()) & (RETURN_NO_ACTION_OK | RETURN_CONTINUE))
    return RETURN_CONTINUE;

The function HealTarget has no trap to handle the (Unit* target) having a NULL value. Could you explain why your using a conditional parameter for HealTarget

If Unit* target

(GetHealTarget()) = 00000001 bitwise (AND) & RETURN_NO_ACTION_OK = 00000001 or RETURN_CONTINUE = 10000000

result = 00000001

if no Unit* target the result = 00000000

So a boolean value will be passed to HealTarget as a parameter, but this is how it's defined

CombatManeuverReturns HealTarget (Unit *target);
kennumen commented 12 years ago

That's a great catch (I'm glad I at least had the good sense to only use the new GetHealTarget code for one class). I have fixed the error you pointed out and pushed the code.

I'm not sure why that if confuses you... First "GetHealTarget()" gets resolved. This is a Unit* pointer which is either a pointer, or null. This gets processed by HealTarget() and returns a CombatManeuverReturns. Next, RETURN_NO_ACTION_OK and RETURN_CONTINUE get bitwise ORed. From PlayerbotAI.h we can see that this results in (0000 0001 | 0010 0000) in other words 0010 0001. This value is then bitwise ANDed with the return value of HealTarget, another CombatManeuverReturns. Finally this value is then boolean checked for the if (either 0 being false, or either RETURN_NO_ACTION_OK or RETURN_CONTINUE - 00000001 or 00100000 being true).

In short, if HealTarget(target) returns either RETURN_NO_ACTION_OK or RETURN_CONTINUE, this check returns RETURN_CONTINUE, telling the bot to skip this 'tick' because it's busy healing.

I hope that answered your question.

Sidenote: I did another small test stretch (with the full 5 man team), everything seems to be in order - great even. Don't know if you read my post on the playerbot forums, but I feel this would be a good time to 'freeze' this code into the beta (being portal).

blueboy commented 12 years ago

Yeah,

Sorry I missed your post on the fourm, but have just replied :| Ah yes, sorry I can see now my mistake. I read the parameters as

     HealTarget(  (GetHealTarget()) & (RETURN_NO_ACTION_OK | RETURN_CONTINUE) 

but I now realise it's

     HealTarget(  GetHealTarget()  ) & (RETURN_NO_ACTION_OK | RETURN_CONTINUE)  

After a solid week of debugging I feel like the guy in the film 'Scanners' BANG! , I need to take a break from the PC ;)

Cheers

kennumen commented 12 years ago

That movie's a bit before my time. I did see a remake/spinoff/ripoff more recently that sounded similar but wasn't horror, just (trying to be) a thriller. B-movie at best though, but the concept had potential.

Anyway, thought that might've been what you misread. With that sorted I declare this issue closed. Off to create a new one :)