RPGHacker / asar

(Now) official repository of the SNES assembler Asar, originally created by Alcaro
Other
194 stars 42 forks source link

Feature request: pool keyword #209

Open spannerisms opened 3 years ago

spannerisms commented 3 years ago

This is mainly a feature that I think would be useful for disassembly. The idea is taken from MathOnNapkins who uses this keyword in his disassembly of the US version of ALTTP.

The pool keyword should temporarily enter a new namespace and automatically affix the name of the pool to any sublabel. Upon encountering a top-level label (unless the label uses # to not interfere with hierarchy), the pool should terminate its functions. It should also be terminated by a pool off. Ideally, pools would be stackable as well.

A clonky system of macros could currently accomplish this task to an extent, but a simple keyword to facilitate this functionality would be immensely helpful for disassemblies, as many games include data for routines before the routine itself.


Use case example from my own work:

AncillaSpawn_SwordChargeSparkle_data:
.offset_y
#_1CF921: dw   5,  12,   8,   8

.offset_x
#_1CF929: db   0,   3,   4,   5

.mask_y
#_1CF931: db $00, $00, $07, $07

.mask_x
#_1CF935: db $70, $70, $00, $00

AncillaSpawn_SwordChargeSparkle:
#_1CF939: PHB
#_1CF93A: PHK
#_1CF93B: PLB
#_1CF93C: LDX.b #$09

Here, I'm using a fudged sublabel to do label calls like LDA.w .data_mask_y,Y later in the routine.

Under the proposed functionality, I could refactor to something like this:

pool AncillaSpawn_SwordChargeSparkle
.offset_y
#_1CF921: dw   5,  12,   8,   8

.offset_x
#_1CF929: db   0,   3,   4,   5

.mask_y
#_1CF931: db $00, $00, $07, $07

.mask_x
#_1CF935: db $70, $70, $00, $00

AncillaSpawn_SwordChargeSparkle:
#_1CF939: PHB
#_1CF93A: PHK
#_1CF93B: PLB
#_1CF93C: LDX.b #$09

Allowing me to just use LDA.w .mask_y,Y in the routine. This may not seem like much, but it's incredibly tedious to pick through every fudged .data sublabel after running a find and replace to name unnamed labels. It also makes searching a little easier. Clicking .data_mask_y and searching in many editors will automatically put data_mask_y into the query field. No such label will be found, only references to it. Another small thing, but if it's easily fixed, that'd be really nice.

This basic functionality could be replicated with something like the below, but it's slightly less elegant and not able to handle stacking. It would also demand using top-level labels to create the data labels, which still has the problem of requiring extra tinkering (removing individual periods from the pool would be pretty tedious).

macro pool(name)
    namespace <name>
endmacro

%pool("AncillaSpawn_SwordChargeSparkle")
offset_y:
#_1CF921: dw   5,  12,   8,   8

offset_x:
#_1CF929: db   0,   3,   4,   5

mask_y:
#_1CF931: db $00, $00, $07, $07

mask_x:
#_1CF935: db $70, $70, $00, $00
%pool("off")

AncillaSpawn_SwordChargeSparkle:
#_1CF939: PHB
#_1CF93A: PHK
#_1CF93B: PLB
#_1CF93C: LDX.b #$09

There's also no way to allow functions to share a data table elegantly. A quick example of that is shared data between probes and sparks:

ProbeAndSparkCheckDirXSpeed:
#_05C359: db   1,   1,  -1,  -1,  -1,  -1,   1,   1

ProbeAndSparkCheckDirYSpeed:
#_05C361: db  -1,   1,   1,  -1,  -1,   1,   1,  -1

ProbeAndSparkXSpeed:
#_05C369: db   8,   0,  -8,   0,  -8,   0,   8,   0

ProbeAndSparkYSpeed:
#_05C371: db   0,   8,   0,  -8,   0,   8,   0,  -8

I'm not a big fan of having such verbose label names unless it's necessary. Under the proposed functionality, I could refactor into this:

pool Probe
pool Spark
.check_x
#_05C359: db   1,   1,  -1,  -1,  -1,  -1,   1,   1

.check_y
#_05C361: db  -1,   1,   1,  -1,  -1,   1,   1,  -1

.speed_x
#_05C369: db   8,   0,  -8,   0,  -8,   0,   8,   0

.speed_y
#_05C371: db   0,   8,   0,  -8,   0,   8,   0,  -8

pool off

With the ideal of stackable pools, the above would create 8 different labels: Probe_check_x, Probe_check_y, Probe_speed_x, Probe_speed_y, Spark_check_x, Spark_check_y, Spark_speed_x, Spark_speed_y.

That pool off would also serve a use for this specific example, because immediately following this table is more data that would be useful pooled.

Another option could be comma separated pool names, like pool Probe, Spark. Then, assuming the pool is disabled at new pool definitions as well, the pool off would be less necessary.


I hope this isn't asking for too much, but I know that MoN managed to get it working in his own xkas fork. I've held off on this idea for a while because it's admittedly somewhat niche in the grand scheme of asar, but I'm in sort of a brave mood right now.

Such a feature probably won't see use in the homebrewing or ROM hacking scene, because there, you have freedom to restructure code as you see fit. This is not the case for disassembly though, where you're stuck with whatever structure the original developers used. In my case, that leads to awkwardness that, while not detrimental, is ultimately annoying.

I want my disassembly to be recompilable with asar, and while I can technically achieve that now (ignoring all the missing and typoed labels lying ahead of me), anything I can do to improve the final product's form factor is something worth looking at to me.

And so there's no confusion, those #_AAAAAA: at the start of every line are the ROM addresses of any code or data. I haven't made a request for a feature to replace those, because I honestly quite like my solution. It does what I need it to (show addresses at the beginning of lines) without being intrusive. And #_[A-F\d]{4,6}[a-z]?: is quite an easy pattern to catch for fading so that it's less in-your-face. Plus, once I'm done, I can create an absolutely massive symbols file that will allow me to meticulously check for any address label that assembled to the wrong location. But this is just me rambling.

p4plus2 commented 3 years ago

namespace nested on can enabled stacked namespaces (default is off). Other than that, this request kind of feels like its a duplication namespaces. Maybe we could have namespaces create an implicit label to solve the sub label problem -- but this could be a breaking change.

EDIT: Ah I think I see, you want to be able to have labels get multiple names effectively?

Alcaro commented 3 years ago

This sounds like a rather complicated solution to me. I'd prefer if we create some kind of label-like construction that affects sublabels as if it's a normal label, but doesn't actually create a label, so the real label can be defined later; no need for anything like pool off. For example

_routine:
.sub1:
db 1,2,3

routine:
lda .sub1,x
lda routine_sub1,x
jsr .sub2
jsr routine_sub2

.sub2:
rts

No opinion on what the syntax should be. pool blep or pool blep: sound good to me.

spannerisms commented 3 years ago

I'm worried it would interfere with my address labels if the underscore were used. Would +Label: be a viable solution?

Stackable "pools" (whatever they end up being called) is something I'd really like to see included in a solution, but if that's not really viable right now, I'd be satisfied with something like your alternative. Because really, that's what the pool keyword is doing. Creating a label for sublabels to use without creating the actual label.

p4plus2 commented 3 years ago

Hm, I think I have an idea, give me half an hour and I can push a prototype

Alcaro commented 3 years ago

_blep: is already a perfectly normal label named _blep. I'm not proposing we change the meaning of existing syntax, it's just a placeholder syntax to demonstrate the rest of the proposal.

Obviously it should be stackable, but that wouldn't need any extra effort beyond the base feature, it'd just look like

phantom foo:
phantom .bar:
..baz:
db 4,5,6
.bar:
db 7,8,9

foo:
lda foo_bar_baz,x
lda foo_bar,x
jmp foo

I'd prefer if we use a keyword, not a prefix character. We already have too many label prefixes, it's already hard to keep them apart.

spannerisms commented 3 years ago

Sorry, I got confused! I thought you were proposing _ at the start of the label be used to create the phantom label. I'm not particularly attached to the word pool. Any word is fine and thanks for looking into this.

p4plus2 commented 3 years ago

Current implementation subject to much change:

pool test test2
.asd
.zxc
pool off
; Defines test_asd test2_asd test_zxc test2_zxc
spannerisms commented 3 years ago

That's awesome! Thanks.

p4plus2 commented 3 years ago

Hm, I wanted to make non-sublabels used in pools generate an error. But this would make your label addresses go nuclear. Right now non-sublabels are ignored but I am not entirely convinced that is ideal behavior. I also wonder if I should make pool off its own command so that off isn't effectively a special case. (especially since off could be short for offset). At this time stacking it doesn't seem like an entirely great idea as this would have some nasty explosions in memory usage if somebody nested too many layers I think (though some hacky tricks to optimize this could exist...). Plus then we have to figure out good was to turn off stacks one layer at a time.

spannerisms commented 3 years ago

Would it be possible to make #Label type labels not generate an error? I can see potential for use cases where someone might want a generic top-level label used within a pool without breaking hierarchy.