ISeeDEDPpl / Questor

Questor
http://www.thehackerwithin.com

Crashes on cleanup stage (Stacking items hangs?) #312

Closed bugc closed 11 years ago

bugc commented 11 years ago

For about the last 3 builds I have been getting crashes after 1 to 3 hours of running 2 instances (or even one). It usually happens at the Cleanup stage, which is the state shown when the client stops. Here is a screenshot: http://imageshack.us/photo/my-images/842/181212b.jpg/

As far as I can tell, Questor hangs in an endless loop stacking items. Perhaps this is a second, separate issue? Note how long it kept retrying the task once per second: 8 minutes, and this is just one sample I found.
«
...
01:36:15 [Cache.StackLootHangar] Stacking Item Hangar
01:36:16 [Cache.StackLootHangar] Stacking Item Hangar
01:36:17 [Cache.StackLootHangar] Stacking Item Hangar
01:36:18 [Questor] Wallet Balance Has Not Changed in [ 8 ] minutes.
...
01:36:34 [Cache.StackLootHangar] Stacking Item Hangar
01:36:35 [UnloadLoot.MoveLoot] Loot was worth an estimated [0] isk in buy-orders
»

I need guidance on what data to collect so I can report this properly.

ISeeDEDPpl commented 11 years ago

make sure you are running latest experimental and report back. there have been updates in bleedingedge for a few days now that fix this. I just merged those fixes into experimental.

bugc commented 11 years ago

I was definitely using 'bleedingedge' when this happened. Today I switched to experimental. There was no Windows error message, but Questor hung completely ("Not Responding"). When I closed Q's window the same system message appeared: CCP ExeFile stopped working. This happened within the first hour. 'Unload Loot' was the last state shown in the Questor window before I closed it. UPD: it crashed with the system error window 2 hours later.

Unfortunately the stacking issue persists too.
«
22:06:28 [UnloadLoot.MoveAmmo] Moving [2] Ammo Stacks to AmmoHangar
22:06:29 [UnloadLoot.MoveAmmo] will Continue in [ 2 ] sec
22:06:30 [UnloadLoot.MoveAmmo] will Continue in [ 1 ] sec
22:06:32 [StackItemsHangarAsAmmoHangar] test
22:06:32 [UnloadLoot.MoveAmmo] Stacking Item Hangar
22:06:34 [UnloadLoot.MoveAmmo] Done Moving Ammo
22:06:34 [StackItemsHangarAsAmmoHangar] test
22:06:35 [Cache.StackLootHangar] Stacking Item Hangar
22:06:36 [Cache.StackLootHangar] Stacking Item Hangar
22:06:37 [Cache.StackLootHangar] Stacking Item Hangar
...
22:09:21 [Cache.StackLootHangar] Stacking Item Hangar
... // here I stopped Questor
»

juandrito commented 11 years ago

I have the same error:
[Cache.StackLootHangar] Stacking Item Hangar
07:09:49 [Cache.StackLootHangar] Stacking Item Hangar
07:09:50 [Cache.StackLootHangar] Stacking Item Hangar
07:09:51 [Cache.StackLootHangar] Stacking Item Hangar
07:09:52 [Cache.StackLootHangar] Stacking Item Hangar
07:09:53 [Cache.StackLootHangar] Stacking Item Hangar
07:09:54 [Cache.StackLootHangar] Stacking Item Hangar
07:09:55 [Cache.StackLootHangar] Stacking Item Hangar

Blafasl commented 11 years ago

I have/had the same error but haven't tested Questor in the days since that fix. I guess that if you open the InnerSpace console in the hanging EVE instance you will see a "System.AccessViolationException" occurring every time Questor tries to stack the Item hangar. When the EVE client crashes you can see an error in the Event Viewer that implies the same exception.

Just an idea: do you guys have a Xeon processor? Even though the RAM should be okay for everyone (I tested for 12 hours with memtest86+ with no problems), I guess it may be a hardware problem, or a Windows service is interfering.

bugc commented 11 years ago

Xeon? Hmm... Physics says NO (http://en.wikipedia.org/wiki/Xeon): I can't fit a Xeon in my workstation. I'm running a budget E6500 dual core; see http://en.wikipedia.org/wiki/Wolfdale_(microprocessor) for details. BUT why do you suspect Xeon to be the reason? It wasn't a problem for several builds before but then became one. Why?

My guess may be just as much of a mystery, but I believe something is going wrong with memory as well, like disposing an object in the wrong thread or for the wrong Q instance. I noticed that a single Q session lasts reasonably long (e.g. overnight, tested several times). But two instances crash much sooner, typically within one to three hours.

So I suppose it isn't exactly Q's issue but lies in the interaction of ISX-(DE+Q).

Blafasl commented 11 years ago

Well, my tentative guess was that some small function in Intel's vPro tech could somehow be interfering with Questor/DirectEve. But yeah, the problem seems to be somewhere in between. Still, I wonder why it runs fine for some people while others get these confusing crashes. We all use more or less the same OS (Windows 7 x64); only the underlying hardware is vastly different, ranging from AMD stuff to Intel/Nvidia. But yeah... I am running out of ideas on how to fix it. It seems that this behavior can even "survive" a Windows reinstall, which makes it a real pain to find the cause.

bugc commented 11 years ago

I do think a workaround is possible. When I see the [Cache.StackLootHangar] loop I just manually switch Q's state to CleanUp and everything starts running again. The code could be fixed for the time being by simply breaking the loop after 30 sec or less, I suppose. I do not know where this loop is.
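
A minimal sketch of the 30-second cutoff suggested above. It is written in Python rather than Questor's C# and is not taken from the Questor source; `stack_with_deadline`, `try_stack` and all parameters are hypothetical names used only to show the control flow.

```python
import time


def stack_with_deadline(try_stack, deadline_seconds=30.0, pause_seconds=1.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Retry try_stack() until it succeeds or the deadline passes.

    try_stack is a hypothetical callable that returns True once the hangar
    has been stacked. Returns True on success and False when the deadline
    was reached, so the caller can move on (for example to a Cleanup state)
    instead of retrying forever.
    """
    started = clock()
    while True:
        if try_stack():
            return True
        if clock() - started >= deadline_seconds:
            return False  # give up; the caller decides what happens next
        sleep(pause_seconds)  # one attempt per second, as in the logs
```

The clock and sleep functions are passed in only so the cutoff can be exercised without waiting 30 real seconds.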

Blafasl commented 11 years ago

I would feel better if we could find out why it crashes or does weird things for some of us and not for others. I have tested some of my bots and still get the same crashes/hangs that you guys are experiencing.

bugc commented 11 years ago

Let's track the application's memory size at the time of the crash. I noticed that Q no longer restarts because of memory leaks, but now we have the current issue. Maybe the two issues are linked, i.e. the memory leaks were fixed with a side effect?

Blafasl commented 11 years ago

I made a little progress. I don't know C# (just a little Siemens S7), so I am really happy that I found this.

It seemed weird to me that Questor does not write the exception that occurred to the console log, as if it simply ignored the "try ... catch" routine, and indeed it does. I attached VS to it and set a breakpoint on the catch line at around line 2834 in cache.cs, but it never broke there. After a little research I found this: http://msdn.microsoft.com/en-us/library/dd638517.aspx The cause was that the following needs to be added to Questor's app.config: http://pastebin.com/7pZy5eap It seems that .NET 4.0 changed the handling of exceptions and skips the try-catch block for this kind of exception unless that legacy setting is specified.

But sadly it is still hanging in the loop. The new console log is:
"........
18:06:30 [Cache.StackLootHangar] Stacking Item Hangar
18:06:30 [Cache.StackLootHangar] Stacking Item Hangar failed [System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at ..(IntPtr , String )
   at ..( , String )
   at DirectEve.DirectContainer.StackAll()
   at Questor.Modules.Caching.Cache.StackItemsHangarAsLootHangar(String module) in c:\UsersMyWindowsLoginNameRedacted\Documents\Questor\Questor.Modules\Caching\Cache.cs:line 2829]
18:06:31 [Cache.StackLootHangar] Stacking Item Hangar
18:06:31 [Cache.StackLootHangar] Stacking Item Hangar failed [System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at ..(IntPtr , String )
   at ..( , String )
   at DirectEve.DirectContainer.StackAll()
   at Questor.Modules.Caching.Cache.StackItemsHangarAsLootHangar(String module) in c:\UsersMyWindowsLoginNameRedacted\Documents\Questor\Questor.Modules\Caching\Cache.cs:line 2829]"

ISDP, if you want to get creative with debugging and need me to do something, I would be happy to help. Maybe it is possible to set a boolean when the catch is hit, so that Questor ignores the LootHangar as long as it is set to true; of course it would reset when a new mission starts. It may not be a nice solution, but it could work.
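
The boolean idea above could look roughly like the following sketch. It is Python, not Questor's C#, and only illustrates the control flow; `LootHangarGuard`, `stack_all` and `on_new_mission` are hypothetical names. It also assumes the exception actually reaches the handler, which in Questor's case depended on the app.config change described earlier in this comment.

```python
class LootHangarGuard:
    """Skip hangar stacking after a failure until the next mission starts."""

    def __init__(self):
        self.stacking_disabled = False

    def on_new_mission(self):
        # A new mission clears the flag, so stacking is tried again.
        self.stacking_disabled = False

    def stack(self, stack_all):
        """Call stack_all() unless disabled; return 'stacked', 'skipped' or 'failed'."""
        if self.stacking_disabled:
            return "skipped"
        try:
            stack_all()  # hypothetical stand-in for the real stacking call
        except Exception:
            # Remember the failure so the caller stops retrying every second.
            self.stacking_disabled = True
            return "failed"
        return "stacked"
```

The trade-off is the one already named in the comment: items stay unstacked for the rest of the mission, but the bot keeps moving instead of looping on a call that keeps failing.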

ISeeDEDPpl commented 11 years ago

I'll add a counter and have the counter reset every 10 min or so, which will allow it to not hang... and give us some indication that we need to likely restart questor
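
One way such a counter could work, as a sketch only: this is Python, not the actual Questor change, and `AttemptCounter`, `max_attempts` and `window_seconds` are hypothetical names and values (only the 10-minute reset comes from the comment above).

```python
import time


class AttemptCounter:
    """Allow at most max_attempts per window; the count resets each window."""

    def __init__(self, max_attempts=10, window_seconds=600.0, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds  # 600 s = the "10 min or so"
        self.clock = clock
        self.window_start = clock()
        self.attempts = 0

    def allow_attempt(self):
        """Return True if another stacking attempt may be made right now."""
        now = self.clock()
        if now - self.window_start >= self.window_seconds:
            # The window has expired: start a new one with a fresh count.
            self.window_start = now
            self.attempts = 0
        if self.attempts >= self.max_attempts:
            return False  # limit reached; a hint that a restart may be needed
        self.attempts += 1
        return True
```

A caller would check `allow_attempt()` before each stacking call and skip the call (and log a warning) when it returns False.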

ISeeDEDPpl commented 11 years ago

this should now be fixed, please re-open if you see this occurring with bleedingedge.

bugc commented 11 years ago

Hi, I tried the current build today. Unfortunately the fix isn't good: there is no way to intervene when the issue happens. With the previous logic it was enough to change the state to 'Cleanup' or 'DelayedStart' to break the stuck state. Now that trick doesn't help. I started Q as usual; the state sequence was: GotoBase, Arrived, UnloadLoot. Nothing changed for several minutes and then Q restarted itself. Overriding the state to 'Cleanup' or 'DelayedStart' brings back the same issue. So the endless loop has become permanent.

ISeeDEDPpl commented 11 years ago

ok, in theory it's fixed. I have not tested it =/

try latest bleedingedge and post a log if it's not yet working please.

bugc commented 11 years ago

In practice: Q got stuck once in the 'Switch' state and twice in 'Arm' (the first time I overrode it by manually loading ammo and pushing the state to 'UndockCheck'); these all happened right after the client started. On the second account 'Arm' passed, but on returning from the mission it hung in 'UnloadLoot'. The logs look like this:
01:50:59 [Arm] Begin
01:54:17 [Questor] Wallet Balance Has Not Changed in [ 2 ] minutes.
01:55:18 [Questor] Wallet Balance Has Not Changed in [ 3 ] minutes.
01:56:18 [Questor] Wallet Balance Has Not Changed in [ 4 ] minutes.
01:57:19 [Questor] Wallet Balance Has Not Changed in [ 5 ] minutes.
or
01:56:06 [CombatMissionsBehavior] UnloadLoot: Begin
01:58:39 [Questor] Wallet Balance Has Not Changed in [ 2 ] minutes.
01:59:39 [Questor] Wallet Balance Has Not Changed in [ 3 ] minutes.
02:01:40 [Questor] Wallet Balance Has Not Changed in [ 2 ] minutes.
I MUST add that no ammo was moved from the cargo; usually it was moved before entering the endless loop. Please think about what kind of logs would help you.

bugc commented 11 years ago

ISDP, I'm on 'experimental' now, checking whether I described the situation properly. In fact this build hangs after all items have been stacked following unloading. That means you were right to apply a counter to break the loop, but it is probably in the wrong place. The wallet balance watchdog keeps working during the StackLootHangar loop. Maybe it's enough to use these two conditions to trigger a change to the 'Cleanup' state? Cleanup does help when the cache gets stuck. For me this issue is a pain in the ass because it fires so often. I wonder why it does not affect many people here (or does it?). Perhaps the cause is network conditions? I noticed delays when booting into EVE in the last few days; they were caused by the DE server, probably because of my route to the server. If so, could that also be the reason for the cache failures? What do you think?

ISeeDEDPpl commented 11 years ago

bleedingedge should now be working, the last issue was a timestamp problem that has been corrected. Please confirm.