[DIRTY DRAFT] AGS 4: Watch game variables from the Editor

ivan-mogilko commented 1 month ago

WARNING: a dirty draft, contains ugly code and non-final data serialization, NOT FOR MERGING. UPDATE: started a cleaner branch here: https://github.com/ivan-mogilko/ags-refactoring/tree/ags4--memwatch

This implements a mechanism for retrieving the contents of engine memory via the debugger communication interface. It adds a "Watch variables" panel to the Editor, which lets the user type in variable names and receive their current values.

![ags4-memorywatch-draft4](https://github.com/adventuregamestudio/ags/assets/1833754/20b6007b-a1a3-4fb6-825a-683faa22334f)

What works:

- Global variables, including ones imported from other scripts, plugins or the engine itself.
- Local variables, including function parameters. Their lifetime scope should be detected correctly.
- Any chain of struct member and pointer accesses is resolved.
- Array elements can be accessed using the [n] notation.
- Managed pointers are displayed as handle values; this is not entirely useful, but lets you see whether they are null and compare them with each other.
- Special hardcoded treatment for managed String types, which resolves them to the internal char buffer.

What does not work:

- Attributes (aka properties). Attributes in AGS are secretly pairs of get/set functions. Reading an attribute's value means calling a getter function. There are currently two major issues with that:
  - First, such a call may have side effects, including but not limited to creating a managed object, or returning a managed object and incrementing its ref count;
  - Second, if an attribute is user-defined, then reading it would require running a script. But the engine currently does not support running more scripts while in a "break" state. So this is left for future consideration.
- Displaying the full contents of a struct or managed struct object (or array, fwiw) after typing its name. I suppose that is theoretically possible, but I will not do this at the first stage. For now you will have to type each struct member separately.

How it is done

While compiling a script, the compiler (optionally) builds a "table of contents" of all the script's variables, global and local, and its own functions. This table of contents is saved as an (optional) extension of the compiled script, similarly to the previously added RTTI table.
Implemented two new debugger commands: "GETMEM" and "GETMEM2" (I was not creative), which request memory contents from the running engine.
The first command, "GETMEM", takes an argument containing direct access instructions in a certain notation: x[N]:offset[,type[:offset,type[:...]], where x is a script's memory designation (e.g. g for the global script, r for the current room script, m2 for the script module at index 2, and so on), offset is a relative memory address in bytes, and type tells how to interpret the value at this address (e.g. i1 for int8, i2 for int16, i4 for int32, h for managed handle, etc). I implemented this command first, because it allows reading memory without even knowing variable names. But of course it is quite inconvenient for common users.
The second command, "GETMEM2", takes an argument containing a variable's name, or an access chain of any complexity, e.g. mystruct.member_field.internal_field etc. In order to process this command and correctly resolve all memory addresses, the engine requires the "table of contents" of the current script and the RTTI. If these are not available, the command fails. If they are present, the engine builds a memory access instruction similar to the one in the "GETMEM" command for itself, and then carries on with processing it.
The engine parses either of these 2 instructions and tries to resolve the requested memory address. On success it sends a "REVMEM" command back to the debugger, with the value and type as simple string arguments.
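
To make the "GETMEM" notation more concrete, here is a rough Python sketch of parsing an access instruction and resolving it against raw memory. It is not the engine's code, only an illustration under assumptions: the names `parse_instruction`, `read_memory`, `script_memory` and `managed_objects` are hypothetical, a handle is assumed to be a 32-bit integer, only a handle may be followed by a further offset, and untyped reads are rejected.

```python
import struct
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Type codes named in the description; the little-endian layout and the
# 32-bit handle size are assumptions of this sketch.
TYPE_FORMATS = {"i1": "<b", "i2": "<h", "i4": "<i", "h": "<i"}


@dataclass
class AccessInstruction:
    script: str                             # e.g. "g", "r", "m2"
    steps: List[Tuple[int, Optional[str]]]  # (offset, type code or None)


def parse_instruction(text: str) -> AccessInstruction:
    """Split 'x[N]:offset,type:offset,type...' into its parts."""
    script, *hops = text.split(":")
    if not script or not hops:
        raise ValueError(f"malformed instruction: {text!r}")
    steps = []
    for hop in hops:
        offset, _, type_code = hop.partition(",")
        steps.append((int(offset), type_code or None))
    return AccessInstruction(script, steps)


def read_memory(instr: AccessInstruction,
                script_memory: Dict[str, bytes],
                managed_objects: Dict[int, bytes]) -> Tuple[int, str]:
    """Walk the steps and return (value, type code) of the last one."""
    memory = script_memory[instr.script]
    for index, (offset, type_code) in enumerate(instr.steps):
        if type_code is None:
            raise ValueError("this sketch does not support untyped reads")
        (value,) = struct.unpack_from(TYPE_FORMATS[type_code], memory, offset)
        if index == len(instr.steps) - 1:
            return value, type_code
        # An intermediate step must be a handle: continue inside that object.
        if type_code != "h":
            raise ValueError("only a managed handle can be followed further")
        if value == 0:
            raise ValueError("cannot follow a null handle")
        memory = managed_objects[value]
    raise ValueError("instruction has no steps")
```

With this reading of the notation, a hypothetical instruction such as g:4,h:4,i4 would mean: read the handle stored at offset 4 of the global script's memory, then read an int32 at offset 4 inside the object that handle refers to.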

Other notes...

Global variables need only their address in the script's global memory, their name and their type.
Imported global variables are marked as such at TOC generation, and their actual address is resolved at the linking stage in the engine, after all scripts are loaded and imports are resolved.
Local variables are the trickiest. For them the TOC records their lifetime scope using bytecode positions: the pos at which the variable is allocated, and the pos at which it is removed from the stack. When the engine resolves these, it searches for the vars that were allocated prior to the current execution pos and have not yet been removed. Additionally, this search is restricted to the current function's scope (also recorded in bytecode positions).
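
The local variable lookup described above could be sketched in Python roughly as follows. This is only an illustration, not the actual TOC layout: `FunctionEntry`, `LocalVarEntry`, `find_local` and all field names are hypothetical, and the exact boundary conditions (whether the allocating and removing positions count as "alive") are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FunctionEntry:
    name: str
    scope_begin: int  # first bytecode pos of the function
    scope_end: int    # one past the last bytecode pos of the function


@dataclass
class LocalVarEntry:
    name: str
    alloc_pos: int     # bytecode pos at which the variable is allocated
    free_pos: int      # bytecode pos at which it is removed from the stack
    stack_offset: int  # hypothetical location of the variable on the stack


def find_function(functions: List[FunctionEntry], pc: int) -> Optional[FunctionEntry]:
    """Return the function whose bytecode range contains the execution pos."""
    for fn in functions:
        if fn.scope_begin <= pc < fn.scope_end:
            return fn
    return None


def find_local(name: str, pc: int,
               functions: List[FunctionEntry],
               local_vars: List[LocalVarEntry]) -> Optional[LocalVarEntry]:
    """Find the local variable 'name' that is alive at execution pos 'pc'."""
    fn = find_function(functions, pc)
    if fn is None:
        return None
    best = None
    for var in local_vars:
        if var.name != name:
            continue
        # Restrict the search to the current function's scope.
        if not (fn.scope_begin <= var.alloc_pos < fn.scope_end):
            continue
        # Allocated prior to the current pos, and not yet removed.
        if not (var.alloc_pos < pc <= var.free_pos):
            continue
        # If a nested block shadows an outer variable, prefer the most
        # recently allocated one.
        if best is None or var.alloc_pos > best.alloc_pos:
            best = var
    return best
```

Preferring the latest allocation is one way to handle a nested block that declares a variable with the same name as an outer one; whether the compiler allows this at all is not stated here, so treat it as a precaution of the sketch.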

Remaining questions

Besides the code being written in a dirty way, and the hastily put together serialization format for the "table of contents" (I will rewrite this as soon as I get more spare time), the big question I have is whether we want this "toc" to be persistent in the game scripts.

With RTTI the situation was simpler, because RTTI was required to make nested pointers work.

This TOC is so far only for debugging. An alternate use that I can think of is improving the save system, because knowing the list of variables may let us actually know what we are saving or reading. But this is only an idea.

I see two options here, if we don't want this to be always present in compiled scripts:

- Save the TOC only if the game is built in Debug mode.
- Save the TOC in separate files, packed along with the scripts, as a sort of "debug symbols" for the game.

There's also a question of separating tasks. I wrote this so that the engine does most of the work, but only because that was faster for me to do. I thought that the Editor, or another debugger (whoever sends the commands), could resolve variable names to memory instructions as well. But I'm not entirely certain about that now, because there are imported variables whose address cannot be known without linking, and local variables with their tricky processing.
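
To illustrate what resolving a variable name to a memory instruction involves, whichever side ends up doing it, here is a rough Python sketch that turns a chain of global struct members into a "GETMEM"-style string. Everything in it is an assumption for illustration: the `Field` record, the shape of the `toc` and `rtti` dictionaries, the `build_instruction` name and the mapping of script types to type codes. It handles only global variables and leaves out exactly the hard parts mentioned above (imported and local variables), as well as array indexes.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Field:
    offset: int               # byte offset within the parent memory block
    type_name: str            # name of the field's type
    is_pointer: bool = False  # True if the field holds a managed handle


# Assumed mapping from script type names to the notation's type codes.
PRIMITIVE_CODES = {"char": "i1", "short": "i2", "int": "i4"}


def build_instruction(expr: str, script: str,
                      toc: Dict[str, Field],
                      rtti: Dict[str, Dict[str, Field]]) -> str:
    """Turn 'var.member.member' into 'script:offset,type[:offset,type...]'."""
    names = expr.split(".")
    if names[0] not in toc:
        raise KeyError(f"unknown variable: {names[0]}")
    current = toc[names[0]]
    offset = current.offset
    hops = []
    for member in names[1:]:
        fields = rtti.get(current.type_name)
        if fields is None or member not in fields:
            raise KeyError(f"{current.type_name} has no member {member}")
        if current.is_pointer:
            # Going through a pointer: emit a handle hop and restart the
            # offset inside the managed object.
            hops.append(f"{offset},h")
            offset = 0
        # A nested plain struct only adds to the running offset.
        current = fields[member]
        offset += current.offset
    # Raises KeyError when the chain ends on a whole struct, which matches
    # the "type each member separately" limitation.
    code = "h" if current.is_pointer else PRIMITIVE_CODES[current.type_name]
    hops.append(f"{offset},{code}")
    return script + ":" + ":".join(hops)
```

In this sketch a nested plain struct costs nothing at runtime, since its members are just further offsets in the same memory block, while each managed pointer adds one more hop to the instruction.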

Finally, the "watch" GUI could be better, but that is a completely separate problem that may be dealt with on its own.

Backporting to ags 3

If there's a wish to backport this watch feature to ags3, the RTTI generation must be backported as well. Any RTTI-based features (in scripting) may be omitted, but the RTTI table itself has to be present in order to access nested fields and recognize types.

ivan-mogilko commented 2 weeks ago

For reference, I started working on a cleaner branch: https://github.com/ivan-mogilko/ags-refactoring/tree/ags4--memwatch

I will open a new PR once it is at least usable.

ivan-mogilko commented 1 week ago

Closed in favor of a cleaner variant #2430
