RetroAchievements / rcheevos

Library to parse and evaluate achievements and leaderboards for RetroAchievements
MIT License

[Feature Request] New runtime rcheevos interpreter (Python or custom) #344

Open Souzooka opened 3 months ago

Souzooka commented 3 months ago

Hello, while this is mostly just an idea at this point I wanted to start a public discussion on this topic. Currently, of course, achievements/leaderboards/rich presence only support being coded in the custom rcheevos syntax. rcheevos, as it stands, struggles with many titles on more modern consoles and unfortunately its problems are fundamental and can only really be mitigated at most by future extensions. Alternative means of coding assets parallel to rcheevos will allow developers more freedom in developing not only for titles hostile towards rcheevos, but for developing more expressive and better achievements. These means would be implemented in a similar fashion to how FFXI's achievements are, where there is dummy rcheevos logic which is always false, but an alternate mechanism unlocks the achievement (in this case, the toolkit portion of a Python API).

Here is a list of problems with the current solution that I believe a Python-based (or other alternative) solution may be able to completely (or conditionally, as discussed later) solve:

Problems (rcheevos):

Poorly supported arithmetic

Basic arithmetic and bitwise operations are provided, so the developer can calculate the result of byte(1) + byte(2), etc. Issues arise when the developer wishes to do a more complex operation which requires intermediate values. For example, (byte(1) + byte(2)) / (byte(3) + byte(4)) is impossible to represent in rcheevos: as there is only one accumulator, the result of byte(3) + byte(4) cannot be remembered and then used as the divisor for byte(1) + byte(2). This is addressed somewhat by TheMysticalOne's recent pull request to support 2 accumulators, though it's difficult to say whether that would completely solve the issue. Other arithmetic tools, such as sqrt, pow, sin, cos, and tan, along with rounding methods, are also not available to the programmer, but they may prove beneficial to those who wish to employ 2D or 3D geometry in their achievements (some developers already use 3D bounding boxes, which along with some planes are the only shapes which can be used at the moment).
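To illustrate what named intermediate values would buy, here is a minimal Python sketch of the expression above. The `memory` buffer and `read_byte` accessor are hypothetical stand-ins for whatever a new toolkit might expose; neither is an existing rcheevos API, and the divide-by-zero policy is an assumption.

```python
import math

# Hypothetical stand-in for emulated RAM; a real toolkit would supply this.
memory = bytearray(16)


def read_byte(address: int) -> int:
    """Read one byte from the (hypothetical) emulated memory."""
    return memory[address]


def ratio() -> float:
    """Compute (byte(1) + byte(2)) / (byte(3) + byte(4)) with intermediates."""
    numerator = read_byte(1) + read_byte(2)
    denominator = read_byte(3) + read_byte(4)
    if denominator == 0:
        # Assumed policy: treat a zero divisor as "no result" rather than erroring.
        return 0.0
    return numerator / denominator


def distance_2d(x1: float, y1: float, x2: float, y2: float) -> float:
    """Euclidean distance, the kind of geometry helper mentioned above."""
    return math.hypot(x2 - x1, y2 - y1)
```

Both sums are held in ordinary local variables, so no accumulator juggling is needed.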

Poorly supported pointer arithmetic / beginner-friendly memory map hinders intermediate and above developers

Similar to the above section -- more than 1 address cannot be used to construct a pointer; this also means index-based pointer math is impossible if a pointer has already been followed at least 1 level deep. Remember may help alleviate this issue. However, RA also employs a custom virtual memory mapping for games which, while it works great for beginner developers, quickly becomes a hindrance when pointers are necessary. When reading a pointer, it's necessary to transform it to conform to RA's existing memory map. This has several issues:

  1. A specific pointer may point to one of several different memory regions. While not impossible to work around, handling the pointer for each memory region it may point to is necessary within achievement logic.
  2. Memory mappings are woefully underdocumented. How exactly someone should transform a read pointer to conform to RA's memory mapping is largely passed around via word of mouth (e.g. someone saying you should use & 0x1fffff for pointers on PS1). If you didn't know how you should transform a pointer to make it conform to RA's memory, how would you find out? The page on Add Address (https://docs.retroachievements.org/AddAddress-Flag/) barely discusses this necessity: it utilizes a bitwise & in an example, but doesn't draw a clear distinction between the value of the pointer and the RetroAchievements address, especially for each platform. Even if you conducted your own investigation and possibly ended up at https://github.com/libretro/RetroArch/blob/fcd546b74c0a97d4027667237b8b9c29aa3aa6ab/deps/rcheevos/src/rcheevos/consoleinfo.c#L263 -- which lists the regions which are explicitly defined for each console -- you still wouldn't know how the memory is mapped out for the many consoles which aren't explicitly defined (and explicitly defining a region for them later may actually go on to break existing Add Address flags). For example, how should a pointer be transformed in order to be utilized for the Sega NAOMI 2 Arcade board?
  3. Developers are only able to access regions which are explicitly defined and added to the RA memory map. In both GBA games I've worked on (Smashing Drive and Stuntman), being able to follow a pointer to ROM and evaluate the data at that address would have helped to simplify the existing logic, but the ROM is currently completely inaccessible.
  4. The ability to even transform existing game pointers to match RA's memory map is only possible due to each memory region having a clearly defined constant starting address and length in virtual memory. This makes me particularly concerned that Add Address may be impossible or exceedingly difficult to use for a potential DOS rollout. If the starts of these regions shifted around in virtual memory, it would be incredibly difficult to make them conform to RA's memory mapping (see the "Memory Regions" section here for the proposed DOS layout: https://github.com/RetroAchievements/rcheevos/pull/304).
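The pointer transformation described in the list above can be sketched as a lookup over a region table. This is only an illustration in Python: `Region`, `EXAMPLE_REGIONS`, and `to_ra_address` are hypothetical names, and the region values are made up rather than taken from any real console. The only detail from the discussion itself is the PS1 mask `& 0x1fffff`.

```python
from typing import NamedTuple, Optional, Sequence


class Region(NamedTuple):
    real_start: int  # first address of the region as the game sees it
    length: int      # size of the region in bytes
    ra_start: int    # where the region begins in the flat RA-style map


# Made-up values for illustration only; NOT a real console layout.
EXAMPLE_REGIONS = [
    Region(real_start=0x10000000, length=0x1000, ra_start=0x0000),
    Region(real_start=0x20000000, length=0x2000, ra_start=0x1000),
]


def to_ra_address(pointer: int, regions: Sequence[Region]) -> Optional[int]:
    """Translate a raw game pointer into the flat map, or None if unmapped."""
    for region in regions:
        offset = pointer - region.real_start
        if 0 <= offset < region.length:
            return region.ra_start + offset
    return None  # the pointer targets memory the map does not expose (e.g. ROM)


def ps1_mask(pointer: int) -> int:
    """The word-of-mouth PS1 transform quoted above: keep the low 21 bits."""
    return pointer & 0x1FFFFF
```

A table like this handles a pointer that may land in any of several regions with one call, and returning `None` makes the "region not exposed" case explicit instead of silently reading the wrong address.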

Lack of control flow / guaranteed halting

rcheevos allows for no runtime logic, instead requiring that all possible scenarios are already accounted for and evaluated inside the achievement logic. This is probably the most fundamental flaw of the design listed, as this makes many data structures such as hashtables or linked lists impossible to access (or very difficult, with some extreme workarounds). Titles which utilize some sort of higher-level script interpreting will have that data difficult to access and the number of titles which utilize higher-level languages will only increase as more consoles are supported. DOS titles are also probably more likely to not have rewritten C++'s STL like console developers have, so they would probably be more inclined to use structures like std::list or std::map.

However, rcheevos' current design means that it is guaranteed to halt. This is a boon because even if an asset may be incredibly complicated and long it is guaranteed to complete every update.
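As a sketch of how runtime logic could walk a linked list while keeping a halting guarantee, the loop below caps the number of nodes it will visit. It is written in Python under assumptions: `memory`, `read_u32`, and `write_u32` are hypothetical stand-ins for a toolkit's memory access, and the node layout (a 32-bit little-endian next pointer at a fixed offset, 0 as the terminator) is invented for the example.

```python
from typing import Callable, Iterator

# Hypothetical stand-in for emulated RAM.
memory = bytearray(64)


def read_u32(address: int) -> int:
    """Read a little-endian 32-bit value from the hypothetical memory."""
    return int.from_bytes(memory[address:address + 4], "little")


def write_u32(address: int, value: int) -> None:
    """Write a little-endian 32-bit value (only used to set up examples)."""
    memory[address:address + 4] = value.to_bytes(4, "little")


def walk_linked_list(
    read: Callable[[int], int], head: int, next_offset: int, max_nodes: int
) -> Iterator[int]:
    """Yield node addresses, stopping at a null pointer or after max_nodes."""
    node = head
    visited = 0
    while node != 0 and visited < max_nodes:
        yield node
        node = read(node + next_offset)
        visited += 1
```

Even if the game's list is corrupted into a cycle, the walk visits at most `max_nodes` nodes, so each update still finishes in bounded time.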

Lack of variables

A simple one -- hits stand in for variables, and have very limited use. Allowing data to be saved for later in the program's lifetime would allow for better assets. A very simple and fairly inconsequential example is that for my Air Ranger set (https://retroachievements.org/game/27495) the leaderboards submit their scores earlier than I would like. Instead of submitting on the mission result screen, they submit when the mission completes. I could find a means to determine the player is on a certain part of the results screen pretty easily, but by that time the value I'd like to use would be gone. When the mission completes, the game converts the in-game frame timer used for the mission into seconds and overwrites the frame timer with that. A few frames later, the game unconditionally increments the second count by 1, also wiping out the prior value. This means I had to compromise by submitting the leaderboard early, at what was unfortunately the last possible time, in order to keep the frame precision on player times.
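A persistent variable would let a script latch the frame timer before the game overwrites it and submit later. The Python sketch below is hypothetical throughout: `MissionTimer` and the `mission_active`, `frame_timer`, and `on_results_screen` inputs are assumed to be derived from memory by some toolkit, and none of them reflect real Air Ranger addresses.

```python
from typing import Optional


class MissionTimer:
    """Remembers the last frame count seen while a mission was running."""

    def __init__(self) -> None:
        self.last_frame_count: Optional[int] = None  # persists across updates

    def update(
        self, mission_active: bool, frame_timer: int, on_results_screen: bool
    ) -> Optional[int]:
        """Return a score to submit on the results screen, otherwise None."""
        if mission_active:
            # Keep overwriting the saved value while the timer is still valid.
            self.last_frame_count = frame_timer
            return None
        if on_results_screen and self.last_frame_count is not None:
            score = self.last_frame_count
            self.last_frame_count = None  # submit only once per mission
            return score
        return None
```

The saved value survives the frames in which the game replaces the timer with seconds, so the submission can wait for the results screen without losing frame precision.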

Performance implications

This probably deserves a bit more analysis and admittedly I'm going to introduce some assumptions. rcheevos is a lean solution but its performance is betrayed by 2 major issues:

  1. The need for extreme workarounds to navigate some of its issues. Due to the lack of runtime logic support, all possibilities for memory that indicates an achievement has been satisfied must be evaluated each update. For example, https://retroachievements.org/achievement/262250 evaluates 37 linked list nodes in 37 factorial comparisons instead of the 37 comparisons another solution would need.
  2. Since all assets operate independently, a lot of unnecessary duplicate code has to be run. An obvious example is something like level-based progression for an older game -- you may have 10 achievements with exactly the same logic, differentiated only by the level ID that is checked. Of course, in some instances there could be many more than 10 duplicates. One of my sets, Firefighter F.D.18 (https://retroachievements.org/game/27515), has 360 leaderboards which are constantly running and are all independently evaluated. The only differences between them are 3 values: the stage ID, difficulty, and character ID. A smarter solution (such as a Python script) could simply check once whether a level has been completed, and then determine which of the 360 leaderboards should be submitted using those 3 values instead of checking 360 times. Other sets may be more egregious still, with some rhythm games having over 1000 leaderboards.
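The "check once, then pick the leaderboard" idea from point 2 can be sketched as a dictionary lookup keyed on the three values. This Python sketch is illustrative only: `build_table` and `on_level_complete` are hypothetical names, the leaderboard IDs are sequential placeholders, and the 30 x 4 x 3 split used in the usage note is an invented way to reach 360, not the game's actual breakdown.

```python
from itertools import product
from typing import Dict, Optional, Tuple

Key = Tuple[int, int, int]  # (stage ID, difficulty, character ID)


def build_table(
    stages: int, difficulties: int, characters: int, first_id: int = 1
) -> Dict[Key, int]:
    """Map every (stage, difficulty, character) combination to a placeholder ID."""
    table: Dict[Key, int] = {}
    next_id = first_id
    for key in product(range(stages), range(difficulties), range(characters)):
        table[key] = next_id
        next_id += 1
    return table


def on_level_complete(
    table: Dict[Key, int], stage: int, difficulty: int, character: int
) -> Optional[int]:
    """Called once when a level ends; returns the leaderboard to submit, if any."""
    return table.get((stage, difficulty, character))
```

With `table = build_table(30, 4, 3)`, the table holds 360 entries, and each level completion costs one dictionary lookup instead of 360 independent evaluations per frame.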

Difficulty to maintain assets created with rcheevos

Originally, as indicated by scott, rcheevos basically just began as a way to check if a memory value was a specific value at a certain time -- for example, to check if Sonic had a ring count of 100 in Sonic the Hedgehog. Since then rcheevos has been extended as needed and has become a house of cards of difficult-to-understand nuances. Combine that with other issues, such as logic only showing addresses (i.e. not showing any sort of variable name, so notes have to be constantly referenced back and forth), games becoming more complicated, some developers using RATools, and some employing the workarounds mentioned previously, and it becomes nearly impossible to tell what some logic is doing when viewing it on the website.

rcheevos is specialized knowledge

This ties in with the above point -- people have to learn the rcheevos toolkit. They can't use the rcheevos toolkit anywhere else, and they probably won't have any reason whatsoever to learn it before applying to become a developer. Using Python, Lua, or really any other common solution would give new developers programming skills which could be used elsewhere, and could also speed up the junior developer pipeline for new developers who already know those languages.

Where do we go from here?

Before discussing potential solutions to the above problems, I want to outline a few requirements for a new solution (as written up by Biendeo (https://github.com/Biendeo)):

There are two (and a half, I guess, if you count Lua or another language) solutions I'd like to propose which will address the above requirements and many of the above problems: a Python-based toolkit, and a toolkit based on a custom specialized scripting language (such as perhaps an extension of the already existing RAScript, although perhaps with some changes so that achievements can be arbitrarily unlocked to address point 2 in Performance Implications).

Problems (Either solution)

Developer culture

RetroAchievements developers aren't taught to be scripters -- they're taught, primarily, to utilize the toolkit in the provided editor in order to make achievement sets. This isn't meant to be disparaging, this is just the reality of how developers are taught at the moment. Some developers may not be able to adapt, or refuse to adapt and simply stop developing for a different toolkit. This is why I think that a new solution should run in a similar fashion to FFXI's standalone, so that the old and new toolkits exist in parallel, at least for the time being.

Emulator integration

Not only would emulators have to integrate a new toolkit, if it ran in parallel to the old toolkit they would have to integrate and maintain both means of developing achievements -- this will likely prove a difficult issue to tackle but for the time being I would propose using PCSX2 as a testing ground for a new toolkit. PCSX2 is already standalone and faces probably the greatest incidence of issues which may be solved with a new toolkit.

Runtime logic may not halt

This is a big disadvantage a new toolkit with runtime logic capabilities would have compared to the existing toolkit: it may never stop running, or may simply run for too long. Research would need to be conducted into how best to handle this, and while Python is pretty fast, a custom RAScript interpreter may offer more control over the performance of scripts.
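One possible way to bound a script's run time, shown here purely as a sketch in Python, is to count executed lines with the standard `sys.settrace` hook and abort once a budget is spent. `run_with_budget`, `BudgetExceeded`, and `runaway_script` are hypothetical names; tracing adds noticeable overhead, and a script that catches every exception could swallow the abort, so this is an illustration of the idea rather than a recommended mechanism.

```python
import sys
from typing import Any, Callable


class BudgetExceeded(Exception):
    """Raised inside a script that has used up its line budget."""


def run_with_budget(func: Callable[[], Any], max_lines: int) -> Any:
    """Run func(), aborting it after roughly max_lines executed lines."""
    remaining = max_lines

    def tracer(frame, event, arg):
        nonlocal remaining
        if event == "line":
            remaining -= 1
            if remaining < 0:
                raise BudgetExceeded(f"script exceeded {max_lines} lines")
        return tracer  # keep tracing this frame and any nested calls

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        return func()
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook


def runaway_script() -> None:
    """Deliberately never terminates; used to show the budget tripping."""
    counter = 0
    while True:
        counter += 1
```

Calling `run_with_budget(runaway_script, 1000)` raises `BudgetExceeded` instead of hanging, while a well-behaved function simply returns its result. A custom interpreter could apply the same idea at the instruction level with far less overhead.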

Runtime logic may halt too soon

The other side of the coin -- how should runtime errors be handled? A custom RAScript-like language could try to avoid runtime errors at any cost, but that also runs the risk of causing hard-to-debug unexpected behaviors.
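One conceivable policy, sketched below in Python, is for the host to catch errors from a script's per-frame update, log them, and disable the script after repeated failures so a bug does not crash the emulator or silently misbehave. `ScriptHost`, `tick`, and the `max_errors` threshold are hypothetical choices made for the example, not a proposed API.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger("asset_script")


class ScriptHost:
    """Runs a per-frame update callback and contains its runtime errors."""

    def __init__(self, update: Callable[[int], Any], max_errors: int = 3) -> None:
        self.update = update
        self.max_errors = max_errors  # assumed threshold before giving up
        self.error_count = 0
        self.disabled = False

    def tick(self, frame: int) -> Optional[Any]:
        """Call the script for this frame; return None if it failed or is disabled."""
        if self.disabled:
            return None
        try:
            return self.update(frame)
        except Exception:
            self.error_count += 1
            # Log the full traceback so the failure is visible to the developer.
            logger.exception("script update failed on frame %d", frame)
            if self.error_count >= self.max_errors:
                self.disabled = True
            return None
```

Logging every failure keeps errors debuggable, and the error cap avoids paying for a broken script on every frame.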

Problems (Python)

Requires heavy Python installation

Lua may be a more lightweight solution here.

Newer versions require newer versions of Windows

Difficulty of sandboxing/security issues

Arbitrary Python code can of course easily damage one's PC, so using an already existing language such as Lua or Python would require that scripts run in a sandboxed environment and ideally with no access to network or file I/O.

Conclusion

A new toolkit and its exact implementation are something Biendeo (https://github.com/Biendeo) and I (and hopefully other developers here) would like to investigate, but this is currently mostly in a brainstorming phase. I wanted to open up a discussion here to hopefully identify some potential pitfalls and perhaps discuss finer details.

Hexadigital commented 3 months ago

One other thing worth taking into consideration is non-Windows platforms, specifically retro handhelds (Anbernic, Miyoo, Retroid), mobile devices (Android, iOS) and gaming consoles (PlayStation Vita, Switch, Xbox Series).

As RetroAchievements currently has a wide variety of users, it would be good to consider the feasibility and overhead of different interpreter options on those platforms.

hrydgard commented 3 months ago

Like Hexadigital says, runtime overhead on small devices needs to be considered. Running hundreds of little Python scripts per frame is just not going to work; Python is very, very slow and eats a lot of memory. Linking in a full Python interpreter is also very, very unappealing. Lua is about the heaviest thing I'd consider (PPSSPP).

Souzooka commented 3 months ago

To clarify -- the current idea is just 1 unified script which can manage all assets associated with a game, though it's true that Python has a lot of overhead.

Souzooka commented 3 months ago

Python itself is a lot to ship with emulators, and is probably overkill for this problem. While it would be nice, I think I agree at this point that Lua (or another custom smaller language for managing assets) would be the way to go.