Absolute addressing for zero page variables

freem commented 5 years ago

This has been discussed before, but it's being brought up again because @morskoyzmey has proposed a solution...

I'm not sure if this is the best way of doing it, but I don't have any alternate solutions at the moment.

Xkeeper0 commented 5 years ago

Someone in the thread mentioned using syntax like LDA !$89, which might work better.

I actually ran into this issue in a few places with SMB2's disassembly, which has a few locations that use the LDA $00XX addressing. 😑

freem commented 5 years ago

I've run into this every time I've wanted to do a disassembly, and I end up having to manually specify the hex values. IMO not fun... dunno if ! syntax is used in any other assembler (koitsu mentions using > before a value, though I'm not sure if that conflicts with existing code...)

morskoyzmey commented 5 years ago

Is there any case in which the $00XX syntax should be treated as ZP addressing? If not, then my solution solves just that part of the problem.

Xkeeper0 commented 5 years ago

It doesn't support a case like this:

ZeroPageLocation:
   .dsb 1  ; $7F

...
SomeCode:
   ;LDA $007F
   LDA ZeroPageLocation ; No way to indicate this should not be a ZP read
...
LaterCode:
   ; LDA $7F
   LDA ZeroPageLocation ; ...while this one should be a ZP read

Using a ! as a prefix would allow for using labels or addresses.

morskoyzmey commented 5 years ago

Yes, such syntax would be nice to have, but does it exclude my solution? I did it just to handle the $00XX syntax of clever-disasm, which looks pretty intuitive to me. A general solution needs more time :)

morskoyzmey commented 5 years ago

Well, it wasn't hard to add, except that the "!" symbol is already in use (for expressions). I wanted to use @, but it's for local labels, so I went with the star symbol (*). Works as intended.

diff

Test file:

.base $0000

ldx $0011   ; just for my case
ldy *$12,x
stx *$10+3
lda *%10100
and *Var1
cpx *Var2
sta *Var2+1,x

Var1 dsb 1, $18
Var2 dsb 1, $19

Outputs:

Any suggestions?

koitsu commented 5 years ago

This has been open for a while but I thought I'd chime in, though I think this is now the 3rd or 4th time I have chimed in on this particular subject (assembler syntaxes) in the course of my life. Apologies for its length, but I cover a lot of ground and history here.

Asterisk (*) is really not a good choice, IMO; I will fight strongly against it, as it leads to confusion and is historically abnormal.

I am not against using the a: prefix from ca65 (see below) but the underlying asm6 code uses a switch/case condition comparing against a single char/byte, not something like strncasecmp().

That said: look at how ( and ) are handled in the same code: the same could be accomplished for a:. Doing two byte comparisons is not going to hurt performance-wise, though I do worry about how one would handle a (with no succeeding colon), e.g. a = $fc / ldy a. I'd need to sit down and read the parser code slowly to find out if this could lead to problems (buffer overflows, crashes, or misbehaviours). (Side note: using a as a variable name is bad practise anyway (some assemblers treat a as always referring to the accumulator itself, e.g. inc a == inc). Possibly a kludge would be to reject a as a permitted label/macro name? All assemblers have special/reserved words/symbols.)
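To make the bare-a concern concrete, here is a minimal sketch of a two-character prefix check. It is written in Python purely for illustration and is not the asm6 parser; the function name split_absolute_prefix is hypothetical. It assumes the prefix only counts when a (either case) is immediately followed by a colon, so a label named a passes through untouched.

```python
from typing import Tuple


def split_absolute_prefix(operand: str) -> Tuple[bool, str]:
    """Return (force_absolute, remaining_operand).

    Hypothetical helper: models an 'a:' prefix check, not asm6's real code.
    """
    text = operand.strip()
    # Two-character comparison, case-insensitive: 'a' followed by ':'.
    if len(text) >= 2 and text[0] in "aA" and text[1] == ":":
        return True, text[2:].lstrip()
    # A bare 'a', or any label that merely starts with 'a', is left alone.
    return False, text
```

Under these assumptions, ldy a:$fc would be flagged as forced absolute, while ldy a (with a = $fc) would still resolve as an ordinary label.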

I do maintain that the specifier should be somewhere in the operand portion, not the opcode, as most 65xx coders have a very strong preference when it comes to keeping the "opcode column" limited to 3 bytes/characters (excluding whitespace/tabs).

Previously mentioned was the subject of what past or other assemblers do about this. I keep a public Assembler Manuals directory on Dropbox for this exact reason.

Many of these are 65816 assemblers (I spent more time on 65816 than I did 6502), which does bear relevance when selecting 16-bit registers at run-time, a subject I don't want to get into here because it would just confuse people. I've done my best to omit those details and focus just on the 6502/8-bit behaviour.

For some of the below assemblers, my descriptions may be inaccurate (ex. the solution might apply to expressions and not just operands or literals), so cut me a bit of slack.

All below examples should assemble to bytecode ac fc 00 (i.e. LDY absolute; opcode $ac, operand $00fc, 4 cycles).

65816 assembler by K.P.Trofatter: w suffix on literals (possibly expressions?) and colons are ignored, e.g. ldy $fcw or ldy $fc:w
65816 assembler by Jeremy Gordon: :w suffix on expressions, e.g. ldy $fc:w
acme v0.96.4 (AddrModes.txt and/or 65816.txt): use of the "postfix method" of opcode+2, e.g. ldy+2 $fc (otherwise assembler bases its decisions on literal operand string length, citing "good structured programming" :roll_eyes:)
ca65 v2.17: use of the a: prefix on the operand, e.g. ldy a:$fc
dasm v2.16 by Matt Dillon (search for "FORCE extensions"): .w, .e, or .a suffixes on opcode, e.g. ldy.w $fc, ldy.e $fc, or ldy.a $fc
fasm v1.0 by Toshi Morita: unable to determine from documentation
ORCA/M v2.0 (Apple IIGS) (pages 309-310): ! or | prefix on operand, e.g. ldy !$fc or ldy |$fc
MagicKit v2.51 (both NES and PCEngine): defaults to absolute addressing, programmer has to manually specify when to use ZP via (example) ldy <$ab syntax
Merlin 8/16 (Apple IIGS): any character after the opcode except for D or L; colon (:) was most common, e.g. ldy: $fc or ldyQ $fc
Merlin32 v1.0: : suffix on opcode, e.g. ldy: $fc
nescom v1.2.0: ! prefix on the expression, e.g. ldy !$fc
NESHLA 20051417: unable to determine from documentation
Ophis Assembler v2.1 by Martin C. Martin: unable to determine from documentation
P65 v0.7.2 by Martin C. Martin: unable to determine from documentation
snasm v1.0: .w suffix on opcode, e.g. ldy.w $fc
snesasm v1.0 beta by The Inquisition: vague. Implies absolute addressing is used by default, but doesn't offer a way to force 8-bit/ZP. Highly suspect; would need actual testing
trasm v1.11 by Norman Yen: .w suffix on opcode, e.g. ldy.w $fc
WDC Assembler v3.49 (pages 22-24): ! prefix on expression or value, e.g. ldy !$fc
WLA DX (asmsyntax.rst, asmdiv.rst, codetoknow.rst): .w suffix on the operand, e.g. ldy $fc.w. May also be possible to use on the opcode, e.g. ldy.w $fc, but the documentation implies this may be specific to .65816 mode
x816 v1.12f by Norman Yen: ! prefix on the symbol e.g. ldy !$fc. Might also work with .w on the operand, e.g. ldy $fc.w, but unsure (if so: probably to be semi-compatible with WLA DX)
xa65 v2.3.0: ! prefix on the expression, e.g. ldy !$fc
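For reference, the byte sequence that every entry in the list above is trying to produce can be sketched as follows. This is an illustrative Python model, not code from any of the listed assemblers; encode_ldy and its force_absolute parameter are hypothetical names. The opcodes are the standard 6502 ones: $a4 for LDY zero page and $ac for LDY absolute.

```python
LDY_ZERO_PAGE = 0xA4  # LDY zp, 2 bytes
LDY_ABSOLUTE = 0xAC   # LDY abs, 3 bytes


def encode_ldy(address: int, force_absolute: bool = False) -> bytes:
    """Encode LDY for a 16-bit address (hypothetical helper).

    Without force_absolute, addresses in $00-$ff use the shorter
    zero page form, which is what most assemblers pick by default.
    """
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address out of 16-bit range")
    if address <= 0xFF and not force_absolute:
        return bytes([LDY_ZERO_PAGE, address])
    # Absolute operands are stored little-endian: low byte, then high byte.
    return bytes([LDY_ABSOLUTE, address & 0xFF, address >> 8])
```

With this model, encode_ldy(0xFC) gives a4 fc, and encode_ldy(0xFC, force_absolute=True) gives the ac fc 00 sequence discussed here.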

I hope this demonstrates the large amounts of variance between all the assemblers, and why scrutinising introduction of an uncommon character is justified (rephrased: let us learn from the past 40+ years of 65xx and associated tools).

Edit: Adding "prefix" on some examples to clarify, and info on MagicKit (thanks Memblers!)

morskoyzmey commented 5 years ago

Thanks for the detailed tour. I've done it with a:, as you suggested.

diff

koitsu commented 5 years ago

@morskoyzmey Thanks, this looks good. Though I think https://github.com/morskoyzmey/asm6/compare/ed63c803e3302c0858af807f4b8dc5b37382dab4...master#diff-a3546deb9faf2991939c54aa42a47272R1912 indicates that this is in addition to the change of supporting automatic size determination based on length of string? If so, what happens if you do something like qqqq = $fc / lda qqqq (note qqqq label is 4 characters long)?

morskoyzmey commented 5 years ago

@koitsu, yes, 4 digit determination is included. I'm using the global chars variable as a length indicator, which differs from zero only when a hex ($00fc) or binary (%00000000) argument was parsed (see in code). So it won't react to labels. Just checked it twice.
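The idea of a digit counter that stays at zero for labels can be sketched roughly like this. It is an illustrative Python model, not the code from the fork: the function names are hypothetical, and the thresholds (more than 2 hex digits, or more than 8 binary digits, selects absolute) are an assumption about how the length indicator is interpreted.

```python
def literal_digit_count(token: str) -> int:
    """Digits in a hex ($) or binary (%) literal, or 0 for anything else."""
    if token.startswith("$"):
        digits, valid = token[1:], "0123456789abcdefABCDEF"
    elif token.startswith("%"):
        digits, valid = token[1:], "01"
    else:
        return 0  # labels and other expressions never set the counter
    if digits and all(c in valid for c in digits):
        return len(digits)
    return 0


def wants_absolute(token: str) -> bool:
    """Assumed rule: a literal wider than one byte selects absolute mode."""
    limit = 2 if token.startswith("$") else 8
    return literal_digit_count(token) > limit
```

Under this model a four-character label such as qqqq leaves the counter at zero, so it is sized by its value as usual, while $00fc is treated as absolute.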

freem commented 5 years ago

Sorry for taking so long to get back to everyone on this.

@koitsu, your post is informative and well-researched... It also has me wondering if I should be using ! for absolute ZP addressing, even if ca65 doesn't use that syntax. My original thoughts were biased, based on working with a disassembler that produces ca65 syntax. (I had to go back and change absolute ZP statements to use hex instead.)

@morskoyzmey, I will merge this when I get a chance (things have been pretty hectic lately, and I haven't been able to devote as much time as I'd like to everything).

Yoshimaster96 commented 1 year ago

Just wondering, will this get added anytime soon? I've been working on some disassembly stuff and have found this to be an issue as well.

koitsu commented 1 year ago

Only real workaround would be to explicitly declare the opcode + operand using .db or the like. Really not feasible. :(

Xkeeper0 commented 1 year ago

You can get away with doing it with macros, at least; that's what we did for Super Mario Bros. 2 here: https://github.com/Xkeeper0/smb2/blob/master/src/compatibility-shims.asm#L23-L34

It's not ideal but if you aren't dealing with a lot of them it can be a decent enough fix.

Yoshimaster96 commented 1 year ago

Ended up just using morskoyzmey's fork for now, but hopefully it'll be added to this one in the future.

morskoyzmey commented 1 year ago

> Just wondering, will this get added anytime soon? I've been working on some disassembly stuff and have found this to be an issue as well.

Try to apply pull request from my fork above, it is ok.

freem commented 1 year ago

> Try to apply pull request from my fork above, it is ok.

Indeed it is, I've done so in commit 85d3d68. Thank you :D

freem / asm6f

Absolute addressing for zero page variables #20