MegaMek / mekhq

MekHQ is a Java helper program for the MegaMek game that allows users to load a list of entities from an XML file, perform repairs and customizations, and then save the new entities to another XML file that can be loaded into MegaMek.
http://megamek.org

[0.50.0-SNAPSHOT] Refitting through MekLab within MekHQ freezes application. #4554

Open Tzahr opened 1 month ago

Tzahr commented 1 month ago

Environment

17:25:38,461 INFO [mekhq.MekHQ] {main} mekhq.MekHQ.initializeLogging(MekHQ.java:303) - Starting MekHQ v0.50.0-SNAPSHOT
    Build Date: 2024-08-04T01:23:25.882857488
    Today: 2024-08-05
    Origin Project: MekHQ
    Java Vendor: Eclipse Adoptium
    Java Version: 21.0.3
    Platform: Windows 11 10.0 (amd64)
    System Locale: en_GB
    Total memory available to MekHQ: 8 GB
    MM Code Revision: 0f0d10d35f6bbc190c42be24563140865de228c5
    MML Code Revision: 35aa5ce66ca59272c0d1e05c3d174cd4c0335b3c
    MHQ Code Revision: 76e9cea125956ab9a9d8af389ee60c2bb2eca81d

Description

Loading a mech into the MekLab in MekHQ, creating a custom refit, and applying it causes the application to freeze up. Curiously, after force-closing the application and reloading it, the custom variant does appear as a refit kit, meaning it does save the mechfile without issue.

Files

mekhq.log

repligator commented 1 month ago

See also #4072

I've experienced this exact issue (freeze, but the refit kit is still available) multiple times in 49.19 and 49.20, but I haven't encountered it yet in 50.

Tzahr commented 1 month ago

It happens pretty much every time I do a refit, even in the Nightly, and it's been happening to my friend too, unfortunately.

Sleet01 commented 3 weeks ago

@Tzahr Could you post a campaign file and zipped custom mechs folder so we can try to repro the issue with your setup?

Tzahr commented 3 weeks ago

I've updated Nightly a few times since the initial report; I have not been refitting a whole lot, so I tried it again on the same save to check if this was still relevant, and lo and behold, it now appears to function just fine the few times I tested.

I'll tinker a bit, and should it happen again, I'll report back with the specifics. Until then, though, this appears to have resolved itself, either on my end, or through a knock-on effect.

David-Beverley commented 2 weeks ago

I was able to reproduce this a couple of times on a recent nightly; I'm not sure exactly which day, but within the last week. The freeze triggered when attempting a refit immediately after ending a scenario with StratCon. The unit was still deployed?

The odd part was, upon reloading the save I took immediately after the MM game ended, the freeze did not occur even with no other changes. My only thought is that the act of running a MegaMek session and then starting a refit in the same MekHQ session is triggering the freeze somehow. It would explain why it's hard to reproduce, as any sent save file wouldn't experience it unless you ran a scenario.

I got it to happen twice with the same steps. Get scenario > Run mm game > Save > Refit > Crash > Load > Refit > No Crash

@Sleet01 My Discord tag is garnathor if you want anything additional from me. I'm pretty spotty on GitHub! @Tzahr If you get a chance, could you try to repeat with the above steps as well to make sure it's not just a me thing?

08:48:21,505 INFO [mekhq.MekHQ] {main} mekhq.MekHQ.initializeLogging(MekHQ.java:315) - Starting MekHQ v0.50.0-SNAPSHOT
    Build Date: 2024-08-22T01:20:33.550671310
    Today: 2024-08-28
    Origin Project: MekHQ
    Java Vendor: Eclipse Adoptium
    Java Version: 17.0.12
    Platform: Windows 10 10.0 (amd64)
    System Locale: en_US
    Total memory available to MekHQ: 4 GB
    MM Code Revision: 050762346edc19ae2dc6ec5c088665de612a3cde
    MML Code Revision: fc321e451b068ba6292e43ecc1f6ce80345d79cb
    MHQ Code Revision: dfa62d0446dd0544c2c3daae4cbb751d0a643224

LadyAdia commented 2 weeks ago

More data points for the issue. Nightly CI1600. I'm including my files to see if it'll help reproduce the issue. Of note: if I customize shortly after starting a play session, things work fine. It's only after playing for some time that the freeze will occur.

Upon reviewing my log file, the final error implies the JVM was still trying to process the refit, but I had let things sit for a few minutes before force-closing and giving a time-out error. Ultimately, if it's only a slow-down, it's long enough that most users will likely give up rather than wait.

MHQ Refit Freeze Error CI1600 24-08-31.zip

LadyAdia commented 2 weeks ago

I'll also note this has been a running issue for me for a while across multiple versions. I'm running 2GB of memory on my laptop and assumed until now it was a memory issue and thus not something with the program itself.

Sleet01 commented 2 weeks ago

After some analysis, we've determined that this is not a blocker, but it is a very bad user experience.

After saving the refit target build's information to disk, we're having MHQ invalidate and rebuild the entire unit cache in order to reload and validate the just-written file.

This is a slow process, normally run in a separate thread to avoid making the GUI unresponsive, but here it's running in the main thread so the whole program blocks until the cache is rebuilt.
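For readers unfamiliar with the pattern, here is a minimal sketch of keeping slow work off the GUI thread. It is not MekHQ's code: `BackgroundRebuildSketch`, `runOffUiThread`, and the fake rebuild task are hypothetical names chosen for illustration, and it assumes the slow step can be expressed as a function that returns a result.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical illustration only; none of these names exist in MekHQ.
public class BackgroundRebuildSketch {

    /**
     * Runs slowTask on a worker thread, then passes its result to onDone
     * through uiExecutor. The calling thread returns immediately.
     */
    public static <T> CompletableFuture<Void> runOffUiThread(
            Supplier<T> slowTask, Consumer<T> onDone, Executor uiExecutor) {
        return CompletableFuture.supplyAsync(slowTask)
                .thenAcceptAsync(onDone, uiExecutor);
    }

    public static void main(String[] args) {
        // Stand-in for a slow cache rebuild (assumption: it yields a unit count).
        Supplier<Integer> fakeRebuild = () -> {
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return 42;
        };

        // Runnable::run executes the callback on whichever thread completes the task.
        CompletableFuture<Void> done = runOffUiThread(
                fakeRebuild,
                count -> System.out.println("Rebuild finished, units: " + count),
                Runnable::run);

        System.out.println("Caller is not blocked while the rebuild runs");
        done.join(); // only so this demo does not exit before the task finishes
    }
}
```

In a Swing application, `SwingUtilities::invokeLater` could be passed as the `uiExecutor` so that the callback touches GUI components only on the event dispatch thread.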

We'd like to replace this call with a new cache refresh call, one that can be instructed to load a specific file, in order to remove this slowdown. However, there isn't time to write and test this new code before the next release, so we're targeting 0.50.1.
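To illustrate the difference between a full rebuild and a refresh that loads one specific file, here is a rough sketch. It is not the MegaMek cache or its API: `UnitCacheSketch`, `rebuildAll`, and `refreshOne` are hypothetical names, and the file size stands in for whatever data a real cache would parse out of each unit file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Stream;

// Hypothetical illustration only; this is not the real unit cache.
public class UnitCacheSketch {

    // Normalized file path -> placeholder summary (here, just the file size).
    private final Map<Path, Long> entries = new ConcurrentHashMap<>();

    /** Full rebuild: discards everything and re-reads every file under root. */
    public void rebuildAll(Path root) throws IOException {
        entries.clear();
        try (Stream<Path> files = Files.walk(root)) {
            files.filter(Files::isRegularFile).forEach(this::load);
        }
    }

    /** Targeted refresh: (re)loads a single file and leaves other entries alone. */
    public void refreshOne(Path file) {
        load(file);
    }

    public boolean contains(Path file) {
        return entries.containsKey(key(file));
    }

    public int size() {
        return entries.size();
    }

    private void load(Path file) {
        try {
            entries.put(key(file), Files.size(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static Path key(Path file) {
        return file.toAbsolutePath().normalize();
    }
}
```

The cost of `rebuildAll` grows with the number of unit files on disk, while `refreshOne` reads only the newly written file, which is why a targeted call would avoid the long pause described above.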

In the meantime, there are a few options for players who experience long pauses after starting a refit from the MHQ MekLab tab:

  1. Use MML to create refit mtf files, then apply them from the MHQ hangar. This may require manually refreshing the cache in MHQ first in order to make the new refit visible, but this should be nearly instantaneous.
  2. Time how long it takes for MML to completely rebuild the cache from scratch when first starting a new install; use this time as the expected time for a refit to finish loading. Not ideal, but the times should be the same.

Unfortunately this issue is exacerbated by any states that make a cache reload slower: slow drive, slow processor, low memory all contribute to the time taken. So low-power laptops are hit the hardest.

OrbMonky commented 6 days ago

I have been lucky so far, but I had one happen today on 0.50.00 (Windows 10/Java 17).

(two screenshots) mekhq.zip

I had the client open overnight (maybe longer), but it wasn't affecting performance. I had run repairs that day which may have affected 'mech tech repair time availability, possibly relevant given the error.

The refit in question was a Shugosa LoaderMech being refit to remove half a ton of armor (leaving armor values at 2 on every location except 3 on the head and 3 on the CT front), add a mine layer in the CT, and move the Searchlight to the head location. This may help in reproduction.

When the crash occurred, it closed the MekHQ application in the Task Manager, but the window remained visible on screen. Java remained open and locked up with 9.3 GB of RAM seized, and it had to be force-closed.

OrbMonky commented 2 days ago

I've been having them more frequently now. I even boosted my RAM up to 16 GB for MekHQ, but it didn't help. The only other lead I've noticed is that they've happened when I pulled a unit into the HQ's MekLab for customization, then went to other tabs and checked things. If I stay within the MekHQ MekLab for the whole customization process, it seems to be stable.