[meta/future] Process Improvements for the Future

There are a lot of things that we're improved along the way, but as we get close to the end of the decompilation aspect -- there's less incentive to make it better. But we'll want to take a step back and improve some issues before moving onto the next games.

This is just a place to capture those things we'd like to improve.

Project Structure

[x] Make it easy to handle multiple games / multiple versions of the same game

Decompilation

[x] Detect and output warnings about common problems
- Method and function call splits
- Missing label definitions
- Missing res-tag casts
[ ] Output useful decompilation results in a machine readable format.
- I've made a lot of scripts that are helpful, but do very gross things to work. A big part of this is because most of the information has to be gathered by brittle text searching. If we detect and output our errors in a nice format, then anyone can make scripts in a language agnostic way.
[x] Handle more automatically.
- Vector / Event-Message-Block / art-joint-anim usage automatically
[ ] More intelligent error/warning detection
- Method/Function args obviously wrong (being cast to things that make no sense)
- Stack argument accessed in ways that don't make sense / is it too big given the context of the function?
[ ] Deal with annoying far labels
- floats are probably the most annoying remaining part of label casts
[x] Make the decompilation process more interactive
- it'd be nice to have a repl of sorts for decompiling a file. Currently it requires keeping a lot of files open / doing lots of manual and repetitive searches. I feel like a good amount of this can be done automatically, with the information presented to the user. The user then makes a decision and moves on -- of course, you're always going to have some situations where you have to do things the old way.

OpenGOAL src handling

[x] Better updating scripts
- Right now I've made a very naive updating script that just dumps the entire file into the goal_src equivalent
- This is fine, until we start cleaning up that code / commenting it / etc. Then it becomes unusable. There is going to be a ceiling to how good this can be made, but perhaps we could limit it to updating certain functions, etc.
[x] Automatic src updating
- Going back to the tools outputting things in a machine readable format -- it'd be nice if the compilation process would output which file it failed in, the updating script could use that as input to iteratively keep updating files until it builds -- I'm not a fan of blindly mass updating everything. This might be something I do sooner than later.
[x] Faster builds / cache unchanged files
- I think we already have an issue for this, but in general it wasn't so bad when we had 100/200 files. Now that we have 400 files, it can get a bit annoying when you are updating 100s of files.
[ ] Detect duplicate code segments -- this is usually indicative of a macro, and we could implement them to make the code cleaner.

This is a long reply, but hopefully it can get us started thinking about this. We probably have a lot of cleanup left in Jak 1, and like 1 million lines of Jak 2 to do in the future, so process improvements are definitely worth thinking about. If you didn't notice I got kind of burnt out working on the decompiler and a huge number of features are currently unimplemented or were never done right. It was never really planned to be a "full decompiler" that tried to make nice source code, but slowly evolved to this without much design. At the bottom I've put together a list of what I think are absolute requirements for a future version.

We already have some support for multiple versions with the decompiler config system. I think it is enough for totally separate games where you don't want to share anything between them, but it doesn't work that well for different versions of games that are only a tiny bit different. I'm not sure of an easy way to solve this. It also becomes a nightmare to test because all the different versions have tiny differences (jak 1 demo has the opposite endianness for VAG files, jak 2's demo has different block sizes for compressed DGOs, jak 3's demo has a different v5 link format...). In total there's >20 versions of all the games, and I've had people ask about extracting textures/models/audio from all of them.

For the common problems, there are a lot of checks I think we could implement that could catch many of them. Right now there's nothing that checks that your stack layout or label types are actually free of overlaps. Realistically some of these are very hard to reliably detect and we'll probably never do it perfectly. If we were okay with having some false positives, we could try to detect common stuff like the res-tag cast and missing args. I also think that a cleaner output would make it much more obvious when something like this happens. Ideally, casts are rare enough that whenever we see one, we look closely and understand why.

I think the decompiler should output some JSON that gets read by a GUI. So this could help solve the text parsing issue. Your scripts have been super useful, but I agree it would be nicer if we didn't have so much text-based replacement. An interactive decompiler would be cool but a lot of work. If you had a GUI that looked at the JSON, you could make it "interactive" by rerunning the decompiler whenever the user changes a setting and clicks a re-run button.

The code updating is a tricky problem. All your ideas are good, but I don't think any of them will completely solve the problem. It might be better if we try to finalize files more before moving on to the next one. I'm not a fan of the giant PRs with 200 files changed and 10,000's of lines of renaming changes. We got through the first ~1/3 of the game being able to do this, but it became harder once we got to the later code. Large refactorings are going to be annoying and introduce random bugs, no matter what we do.

Compilation speed is a tricky one. The current build system ((mi)) is able to avoid rebuilding things if it knows that none of its dependencies have changed. But, we don't know that for GOAL code and we just tell it that every file depends on every single file before it in the build order. I guess you are still using (build-game) because not everything is build with (mi) yet, so we should fix that. It seems possible to detect the dependencies for things like functions/macros, but it's harder when you have macros that set global GOOS variables. This is something I'm really curious about in the original GOAL. How did they handle this?

Decompiler Requirements (in no order)

pretty printer v2. I don't think we need it to be perfect, but we should at least get the formatting for defun to not split across multiple lines. Also longer than 80 character lines.
type propagation with back-prop. I got like 75% of the way there once but gave up. This would reduce the number of silly casts by a lot.
"click to rename" variables, like ghidra
"click to add cast", like ghidra
handle far labels in a way that works
some sort of pattern matching for macro insertion. Maybe we could apply this on the s-expressions directly. This would make it easier to write a pattern matcher (the current GenericMatcher stuff is confusing), but we'd lose a lot of context. An s-expression matcher is currently a bit tricky to do because the decompiler sometimes cheats in its s-expression generation and will output a single symbol like :foo bar in order to work around the pretty printer being terrible.

Yeah I agree with all you've said, and a few of your points i was thinking of originally but forgot half-way through writing mine.

To respond to a few things directly:

I'm personally ok with some false positives, a lot of the things I've done in my hacky way have false positives but they aren't that bad, or are easily dismissable. But we will need to be careful not to add things that become annoying.
An interactive decompiler would be a lot of work, depending on how far you take it. My vision is something that just nicely combines everything we already use in an central location. In that sense, we could likely start small and extend it overtime, as we've already been doing.
Yeah, dealing with a lot of the versions is a pain -- that was what i was mostly thinking, how we have to currently duplicate so much just to support a very small diff. Maybe this is something that can only be cleaned up afterwords, or isn't worth worrying about at all.

I feel like a lot of the workflow tooling could be solved with something like a VSCode extension Potential workflow improvements when decompiling:

first-class syntax support, highlighting and code outline
- https://code.visualstudio.com/api/references/vscode-api#DocumentSymbol
monitor file changes and recompile without needing a python script
type/field search
method/function search
commands to:
- go to next error
- recompile file
- update reference tests
- update gsrc
1 click to disable a function that isn't working yet
1 click to add a function to ref tests skipping

Potential improvements for OpenGOAL in general:

syntax support
formatting using our pretty printer

IR Output Ideas:

either with the extension or natively in the IR itself, be more clear about when casts are effecting the output.

As a tangent, the update-gsrc flow needs to be more resilient and useful (might do this asap as its annoying):

if a file originally has custom comments in it, skip it
functions that we have marked as skipped in ref tests, should automatically be removed from gsrc

We should experiment with reducing the burden of merge conflicts as well.

try using YAML instead of JSON
if this still causes conflicts when making simple additive changes, perhaps we could merge multiple files together so you can add your casts independently.

We'll want https://github.com/open-goal/jak-project/issues/497 for the future as well. Methods were far more annoying than functions to deal with.

A more intelligent type analysis is a must IMO. Our casts file is seven thousand lines long, the stack structures file is five thousand and the var_names is >4000. Linearly scaling this and that's like over 30k lines of random configuration for Jak 2. This is WAAAAY too much work and really difficult to manage. An S-expression macro matcher would also be extremely useful, specially if we could summon it for specific contexts only to avoid false positives. The GenericMatcher is fine if we want a little more context, I think we can reasonably make it just work a little better/add more features.

On the topic of type analysis, a way to the tell decompiler that we don't care about the specific type of a function, or that it can be "anything", would be extremely useful. This is mainly an issue for the event handlers, which there are already a lot of in Jak 1, which end up looking... terrible. There's basically no instance where the output doesn't desperately need some clean-up, which we can't do because it would be overwritten in case we wanted to re-decompile the file or refactor the output, so we have to deal with it. Their code certainly never looked like this, for example:

  (let ((v1-0 arg2))
    (the-as int (cond
                  ((= v1-0 'untrigger)
                   (the-as int (when (= (-> arg0 type) light-eco-child)
                                 (let* ((a1-3 (-> arg3 param 0))
                                        (v0-0 (logxor (-> self angle-mask) (ash 1 a1-3)))
                                        )
                                   (set! (-> self angle-mask) v0-0)
                                   v0-0
                                   )
                                 )
                           )
                   )
                  ((= v1-0 'trigger)
                   (the-as int (when (not (-> self player-got-eco?))
                                 (set! (-> self player-got-eco?) #t)
                                 (let ((a1-5 (new 'stack-no-clear 'event-message-block)))
                                   (set! (-> a1-5 from) self)
                                   (set! (-> a1-5 num-params) 0)
                                   (set! (-> a1-5 message) 'white-eco-picked-up)
                                   (the-as int (send-event-function (ppointer->process (-> self parent)) a1-5))
                                   )
                                 )
                           )
                   )
                  ((= v1-0 'beam-off)
                   (the-as int (go light-eco-mother-discipate))
                   )
                  )
            )
    )

Plus, manually cleaning up code just means another way that we can make mistakes.

I think reducing the amount of casts, macros and other kinds of cleanup that we have to do for the decompiler would pay off in the long run the most.

Edit: If we could at least fix #578 and #852 it would be very helpful. The former straight up causes the decompiler to break completely (load-game-text-info) and the latter partially causes the thing above.

These are all good points.

I agree 100% that we need to improve the process of adding casts. We should have way fewer of them, and it should be easier to figure out. I was never really happy with how this worked out for jak 1 level code, but I was too distracted with trying to get a head start on graphics, so we wouldn't reach the end and just be stuck waiting on all the renderers (yet here we are...) You could imagine the decompiler spits out a line like: didn't know the type of a1 here, input a type and then you can just type in the type name. I don't know if this is a good idea exactly, but you get the point - it can be easier. The current workflow was designed around just decompiling a few files where the biggest concern was accuracy over convenience and readability. Which is something I think we did really well with. The majority of files worked perfectly and we never had to do some huge "oops decompiler bug, now we have to update 100 files"

For the example you gave, I'd argue that setting the type to object should just do this, and all the weird the-as ints and the v0-0 following (set! (-> self angle-mask) v0-0) is just stuff that's not quite right with how it inserts casts (why is it trying to make that block integer anyway..). Some of it was probably an overreaction on my part to a few bugs we had early on with uint128s returning on only some cases of a cond in res.gc (didn't end up mattering... the game never uses this) and with the return value of a event handler being lost in actor-link.

The unfortunate thing is that a lot of these fixes (better type analysis, s-expression matcher, cleaning up the event handler example you showed) would require some pretty large parts of the decompiler to be rewritten. There's a lot I got wrong in the design and I know I could do better with what I know now. That said, the decompiler and type system is about 69k lines of C++ and there are some parts that were extremely hard to get right. I'm not completely opposed to doing any of these improvements, but it's going to take months to do. I had a lot more free time to work on this project in 2020/early 2021 than I do now.

None of this is a decision we have to make now, but we should think about it.

open-goal / jak-project