UsernameFodder / pmdsky-debug

Debug info for reverse engineering PMD: Explorers of Sky
GNU General Public License v3.0
38 stars 20 forks source link

Support symbol name aliases #238

Closed UsernameFodder closed 8 months ago

UsernameFodder commented 8 months ago

Is your feature request related to a problem? Please describe. Various projects rely on pmdsky-debug symbols, including https://github.com/SkyTemple/pmdsky-debug-py and https://github.com/SkyTemple/c-of-time. When symbols are renamed, it can cause compatibility issues downstream.

Describe the solution you'd like Add support for an optional aliases field in the YAML files, like this:

- name: SomeFunction
  aliases:
    - SomeFunctionAlias1
    - SomeFunctionAlias2

This will make things symbols searchable by more names (including old ones), while making it possible for tooling to decide whether or not to use the extra alias names.

Aliases should not be included in the header files, since this would lead to a lot of duplication. Instead, they should be auto-generated for projects that need them (like c-of-time). The headers-with-docstrings artifact should include these generated aliases.

The following tooling should also support aliases:

Describe alternatives you've considered

Additional context This has been a long-running shortcoming of pmdsky-debug. It's probably about time to try to do something about it.

UsernameFodder commented 8 months ago

RFC: @theCapypara: Would this allow us to solve stability issues with SkyTemple? If we update pmdsky-debug-py to include alias names in its mapping, I think as long as pmdsky-debug always keeps old names as aliases, SkyTemple should never break? @tech-ticks: I think this will require linker script updates in c-of-time to map the aliases properly? Is it possible to map multiple names to the same address? Assuming that works, I think if we generate the aliases in the headers, things should just work? @End45: Any general thoughts? Does this seems reasonable? Regarding the issues you raised around No$GBA in https://github.com/UsernameFodder/pmdsky-debug/pull/236#pullrequestreview-1799557116, if we kept the old names as the primary ones, but added the new ones as aliases, I think that should make things better? @AnonymousRandomPerson: I think it would make sense to update the sync-to-pmdsky-debug script to add aliases instead of renaming symbols in the event of a conflict? Is there anything else we ought to change?

AnonymousRandomPerson commented 8 months ago

As long as the original symbol name is adequate at describing the symbol, is it worth the added complexity of these aliases to add a subjectively better name? I feel it would cause less confusion to have a single consistent name for symbols if we can.

The strict requirement to alias the previous name creates instances where an undesirable old name will be forced to stick around permanently, such as with temporary symbol names that need clarification (e.g., ARM9_UNKNOWN_TABLE__NA_2097FF8) or with symbols that we discover are incorrectly labeled. Perhaps a missing alias can be surfaced as a warning without necessarily blocking the PR.

If the goal is to improve searchability, a lower complexity solution could be adding additional keywords to the description field. This could mean aliases or just other words that people commonly try and search for.

UsernameFodder commented 8 months ago

The strict requirement to alias the previous name creates instances where an undesirable old name will be forced to stick around permanently, such as with temporary symbol names that need clarification (e.g., ARM9_UNKNOWN_TABLE__NA_2097FF8) or with symbols that we discover are incorrectly labeled. Perhaps a missing alias can be surfaced as a warning without necessarily blocking the PR.

That's true, this should probably be a non-mandatory check.

As long as the original symbol name is adequate at describing the symbol, is it worth the added complexity of these aliases to add a subjectively better name? I feel it would cause less confusion to have a single consistent name for symbols if we can.

I think this would also address the issues around apps that depend on specific names. Having aliases means we wouldn't be forced to choose between keeping an old name around forever, or updating an old name and breaking the downstream app, provided it knows how to use aliases. The searchability part could be solved with the description field, but that doesn't solve the issues with SkyTemple and c-of-time.

Frostbyte0x70 commented 8 months ago

This seems like a good compromise to me. It allows linking the symbols in this repo to the ones in the decomp, even if they use a different naming scheme there. Meanwhile we can keep using the simplified names, which avoids the problems that have already been explained.

theCapypara commented 8 months ago

@UsernameFodder Thanks, this sounds like a good approach!

I would propose to solve the ambiguity issue that the YAML contains comments explaining when aliases are expected to be removed, and remove any deprecated alias after a year or so. That way apps have ways and time to adapt but they are eventually forced to use the one canonical name.

From my standpoint more ideally the date could even be part of the YAML data itself instead of just a comment. This way I could auto generate warnings for SkyTemple that would be raised when people try to apply a patch using deprecated symbols.

re:

@AnonymousRandomPerson: As long as the original symbol name is adequate at describing the symbol, is it worth the added complexity of these aliases to add a subjectively better name? I feel it would cause less confusion to have a single consistent name for symbols if we can.

True, but as said existing code already uses the old symbols and it's not just SkyTemple itself, also third party patches and ROM Hack projects and setups that researchers already use. I think my solution above would solve this.

@AnonymousRandomPerson: The strict requirement to alias the previous name creates instances where an undesirable old name will be forced to stick around permanently, such as with temporary symbol names that need clarification (e.g., ARM9_UNKNOWN_TABLE__NA_2097FF8) or with symbols that we discover are incorrectly labeled. Perhaps a missing alias can be surfaced as a warning without necessarily blocking the PR.

I also think that making it a case-by-case judgement if the old alias is the same as the previous one is a good idea and maybe it's enough. But we should really only do this for temporary unknown aliases that are likely not used by patches or SkyTemple etc.

UsernameFodder commented 8 months ago

I would propose to solve the ambiguity issue that the YAML contains comments explaining when aliases are expected to be removed, and remove any deprecated alias after a year or so. That way apps have ways and time to adapt but they are eventually forced to use the one canonical name. From my standpoint more ideally the date could even be part of the YAML data itself instead of just a comment. This way I could auto generate warnings for SkyTemple that would be raised when people try to apply a patch using deprecated symbols.

I unfortunately haven't figured out a good way to support YAML comments in resymgen. We could just add a note to the description?

I actually wouldn't mind keeping aliases around forever if people use them. I don't expect them to add much overhead, so deprecation doesn't seem all that necessary? It seems more beneficial to keep old names around for posterity. If you want to add warnings, you could probably just warn whenever someone is using any alias instead of the primary name? Then if we really want to remove an alias, we can just wait a while and then drop it.

tech-ticks commented 8 months ago

I think this will require linker script updates in c-of-time to map the aliases properly? Is it possible to map multiple names to the same address? Assuming that works, I think if we generate the aliases in the headers, things should just work?

Yes, generate_linkerscript.py needs to be updated, but everything else should work without changes.