i7 / counterfeit-monkey

Counterfeit Monkey by Emily Short
http://ifdb.tads.org/viewgame?id=aearuuxv83plclpl
Other
168 stars 30 forks source link

FYI: Startup precomputation ineffective in releases 10 and 11 #216

Closed hanna-kruppe closed 1 year ago

hanna-kruppe commented 1 year ago

Summary

The startup data (save) embedded in CounterfeitMonkey-10.gblorb and CounterfeitMonkey-11.gblorb do not match the game file, so restoring from it always fails silently [1] and startup computations run in full. Then, the Startup Precomputation extension writes out Counterfeit Monkey-startup-data.glkdata to try and accelerate future launches but unfortunately never uses it [2] so the full startup computation runs every time. Releases 7-9 don't seem to be affected (I didn't check even earlier releases).

Details

[1] The save file format includes a copy of the first 128 bytes of the game file (the .ulx, not the .gblorb) that produced the save. These 128 bytes include a header with important data for the Glulx VM and some other metadata stored by the Inform compiler, and if they differ from the game trying to restore, then interpreters "should" refuse to load the file. I've not poked too deeply into the differences, but I did see that several important Glulx VM parameters (ENDMEM, RAMSTART, and the string decoding table address) differ. That probably means the mismatch isn't a benign difference that a less strict interpreter could ignore to successfully restore from the file.

[2] After starting up the hard way, the extension writes an external save game (Counterfeit Monkey-startup-data.glkdata) that is intended to accelerate future launches (or be embedded into the blorb manually). But it never actually tries to restore from that file on future launches, because the restore portion of the extension only looks for that file if it can't find startup data in the blorb. This means an interpreter using a Glk library that doesn't support the "resource stream" feature actually starts up quickly after the first time, while "modern" implementations that support it always start up the slow way. In fact, they're slightly slower than if the game wasn't using the extension in the first place (since trying to restore, and later writing out the save, takes a bit of time).

Affected Interpreters

I think this should affect every interpreter that does basic sanity checks when restoring. I've confirmed it in several ways in Glulxe, and launching the game in Quixe also creates a startup data file in local storage. Although I don't see the console log message I'd expect. I've not tried it with Git but I see its source code has (always had) the same check.

Reproducing

Showcasing the root cause and performance effect described above is a bit tricky. Since a failed restore doesn't usually output anything (the usual "restore failed" message comes from Inform library code, not from the Glulx @restore opcode) only way to really see that restoring fails is to run the game under a Glulx debugger or use an interpreter that outputs a warning when it can't restore form a save file (Quixe seems like it does that, but I don't observe it in my tests -- I did modify Glulxe to do this, tho). The performance effect can be seen by using blorbtool.py to remove the embedded save (delete Data 9998) and noticing/measuring that it starts up significantly faster than the unmodified blorb when a startup data file already exists.

But testing if the issue persists after some changes is simple (with Glk implementations that put the startup data file in some predictable place, like the current directory). Simply remove the existing startup data file if it exists, start the game, and check if it (re)created a startup data file.

Fixing

The simplest way to improve on this situation would be to not include any startup-data file in future release blorbs. The first startup would always be slow (as it was in the past two releases), but after that, Startup Precomputation could do its thing (unlike today). Including a "correct" startup data file in future releases would of course be better, but I assume there's a reason why this happened -- and it's hard to notice this problem if you don't go out of your way to look into it.

Besides changes to Counterfeit Monkey, it might be helpful to extend Startup Precomputation (cc @curiousdannii) to try loading the on-disk startup data even if startup data was found in the blorb, but restoring from it failed. Or at least print a warning in case restoring fails, to help game authors notice the problem.

angstsmurf commented 1 year ago

Is this caused by c431c2e?

hanna-kruppe commented 1 year ago

Seems likely, although that commit was apparently made after release 10. But aside from that, yeah, not regenerating the startup data file right before embedding is in the blorb is the most plausible way to cause this issue.

angstsmurf commented 1 year ago

Thanks for reporting this! That is annoying.

I was probably using the modified script for release 10 before committing it. I'll try to make sure it is fixed in the next release.

angstsmurf commented 1 year ago

d42038e at least makes sure that there is a warning printed if the game cannot restore from the embedded data.

Note to self: The reason we can't just start the game in any interpreter and embed the resulting Counterfeit Monkey-startup-data.glkdata, like the insert.sh script did before c431c2e, is that the game will start with graphics off unless the startup data comes from an interpreter with graphics support. This may sound like an easy thing to fix or work around, but it isn't, partly because the graphics off flag is @protected in order to survive restarts and restores.

curiousdannii commented 1 year ago

Is the graphics setting read from a file? If so it should be re-read after a restore (which the startup computation is).

hanna-kruppe commented 1 year ago

I'm very much clueless and interjecting from the peanut gallery here, so don't mind me wondering out loud, but I don't understand the point about the graphics off flag being protected. If the memory location is protected before the restore-from-startup-data, then it would keep the value it was set to before the restore (however that happened, but it would come from code running in the present, under the player's interpreter). If it's computed after the initial startup, there also shouldn't be any issue. The only way I can make sense of this is if the graphics off flag is set during the part of startup that's skipped via startup precomputation, but only protected later, after the startup data is saved/restored. Which seems... possible, but rather strange and easy to fix (by moving the @protect next to the code that sets the flag) 🤔

I should note I've never used @protect myself, only implemented it in an interpreter, so my understanding of how it's "usually" used could be warped.

angstsmurf commented 1 year ago

The graphics_disabled array is defined here: https://github.com/i7/counterfeit-monkey/blob/4c6638539e21de32b6fdda36a97b0e2a5081e311/Counterfeit%20Monkey.materials/Extensions/Counterfeit%20Monkey/Multimedia.i7x#L748

As far as I can tell, it is only modified by the enabling graphics and disabling graphics actions, i.e. when the user types MAP OFF or MAP ON, so I'm not sure why it is turned off when the game is started on an interpreter without graphics support, but I haven't studied the code very closely.

We want it to be protected, because it is annoying to turn off graphics only to have it turned back on by restarting or loading a save from another interpreter.

If you know of a way to fix this, please let me know.

hanna-kruppe commented 1 year ago

I've confirmed experimentally that the @protect opcode is not executed when starting up release 11 normally under cheapglk (also not when typing map on / map off, since that is rejected earlier on in the respective rule), so I believe you're right that the graphics flag is not touched anywhere else. I am not familiar with the game's code or the extensions it uses, but after skimming some of the code my best guess of what's happening is:

I was going to suggest "try opening the graphics window explicitly after Startup Precomputation is finished" but I see this rule already tries to do that:

https://github.com/i7/counterfeit-monkey/blob/4c6638539e21de32b6fdda36a97b0e2a5081e311/Counterfeit%20Monkey.materials/Extensions/Counterfeit%20Monkey/Multimedia.i7x#L58-L65

Indeed, the map isn't opened on startup, but only after the first couple of prompts -- I had forgotten about that after spending so long trying to get the game to start in a cheapglk-based interpreter :sweat_smile: However, note the condition about "the width of the measuring window" -- if that window wasn't opened during startup and wasn't opened by any other code, then the flexible windows extension will report a width of 0. So perhaps the fix is to open (and later close) the measuring window in the above rule, as is already done in the enabling graphics action.

hanna-kruppe commented 1 year ago

Digging a bit in the history of that code led me to discover this code which seems like it should do what I suggested as fix:

https://github.com/i7/counterfeit-monkey/blob/4c6638539e21de32b6fdda36a97b0e2a5081e311/Counterfeit%20Monkey.materials/Extensions/Counterfeit%20Monkey/Presentation%20Details.i7x#L802-L825

However, it's using a hook of the standard Inform rules, so I think it won't be triggered by Startup Precomputation, which uses bare @save and @restore opcodes.

angstsmurf commented 1 year ago

You are absolutely right. Re-opening the measuring window if the width of the measuring window returns 0 fixes the no-graphics problem. It has nothing to do with the @protected stuff, that was a red herring.

Implemented in 69d5016a774680b5521414adc7371d3f385c9f70. The original functionality of insert.sh is re-instated in 31111c33b5949e590cec14e4241179e20464f88a.

Thanks a lot for the analysis and the fix. As the original author of this code, I really shouldn't have been this confused about the cause of the problem, but I wrote it a long time ago.

hanna-kruppe commented 1 year ago

Glad to hear that it works now! I found this issue by starting Counterfeit Monkey in the Glulx VM I'm working on, and the game also found a couple of bugs in my VM at the same time, so I think we're even :)