vhelin / wla-dx

WLA DX - Yet Another GB-Z80/Z80/Z80N/6502/65C02/65CE02/65816/68000/6800/6801/6809/8008/8080/HUC6280/SPC-700/SuperFX Multi Platform Cross Assembler Package

65816: Problems with Direct Page, address sizes, and indirection #581

Closed rondnelson99 closed 1 year ago

rondnelson99 commented 1 year ago

On 6502 etc., I feel like WLA's approach to address sizes makes a lot of sense. If it doesn't know the location of the label, it defaults to zeropage at assemble time (unless you specify .16BIT) and then errors at link time if that's not the case. The .w instruction suffix works great to specify what I actually want, and it's good to keep these addressing modes in headspace anyway. On 65816, the movable Direct Page makes things messier.

I can't speak for all programmers, but the main way I use the Direct Page is as a movable zeropage for quick access to various memory regions. For instance, on SNES, I might move the DP to $2100 (the PPU I/O area) to make a bunch of PPU writes. For this use-case, the bottom 8 bits of the D register are always zero, since most DP instructions take a cycle longer if it's non-zero. However, I also access the PPU registers using long or absolute addressing, so I still use 16-bit labels for the PPU addresses.

I would like to use e.g. sta.b PPUCTRL (where PPUCTRL = $2115) to write to a register when the directpage points to $2100, but this produces an error at link time since PPUCTRL is a 16-bit value. As a workaround, I use sta <PPUCTRL, which generally does what I want, although I find the syntax odd since PPUCTRL is the actual address I'm writing to. Personally I think the linker should just truncate the address, and either warn or just silently continue since it's entirely plausible that the DP is set so that address does in fact get used.
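To make the arithmetic in this request concrete, here is a minimal sketch in C. It is not WLA DX code: the helper names (dp_effective_address, low_byte, fits_in_8_bits) are invented for illustration, and it assumes the usual 65816 rule that a direct-page access reaches D plus the 8-bit operand, wrapping inside bank 0.

```c
#include <assert.h>
#include <stdint.h>

/* Address reached by a direct-page access: D plus the 8-bit operand,
   wrapping inside bank 0. */
uint16_t dp_effective_address(uint16_t d_reg, uint8_t operand)
{
    return (uint16_t)(d_reg + operand);
}

/* Explicit truncation, like the '<' (low byte) operator. */
uint8_t low_byte(uint16_t value)
{
    return (uint8_t)(value & 0xFF);
}

/* Hypothetical model of the strict check behind the link-time error:
   does the value fit an 8-bit operand without truncation? */
int fits_in_8_bits(uint16_t value)
{
    return value <= 0xFF;
}

/* Worked example with the numbers from the post. */
void demo_dp_truncation(void)
{
    const uint16_t PPUCTRL = 0x2115; /* 16-bit label from the post */
    const uint16_t D = 0x2100;       /* direct page moved to the PPU area */

    assert(!fits_in_8_bits(PPUCTRL));   /* why sta.b PPUCTRL is rejected */
    assert(low_byte(PPUCTRL) == 0x15);  /* operand emitted by sta <PPUCTRL */
    assert(dp_effective_address(D, low_byte(PPUCTRL)) == PPUCTRL);
}
```

The last assertion only holds because the low byte of D is zero. With D = $2110 the same operand would reach $2125 instead, which is the kind of mismatch that silent truncation could hide.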

The real problem is when indirection is used. Take sta (<SOME_LABEL) where SOME_LABEL is, say, $0205, and D=$0200. I want to use the Direct Indirect addressing mode (love these naming conventions lol) on the address $0205 (i.e. D + $05). It seems that when it sees the < operator (lobyte() does the same thing), it decides that the parentheses are for order of operations, rather than indicating an addressing mode. Besides making labels that don't reflect the memory's actual location (which I'd really like to avoid) or .db-ing opcodes, I'm not sure how to avoid this.
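The difference between the two encodings can be sketched in C as follows. This is a toy model under stated assumptions, not emulator or assembler code: bank0 is a made-up 64 KiB array standing in for bank 0, the data bank register is assumed to be 0, the accumulator is assumed to be 8 bits wide, and the function names and the pointer value $1234 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Toy memory for bank 0; the data bank register is assumed to be 0,
   so the pointed-to address also lands in this array. */
uint8_t bank0[0x10000];

/* STA (dp): fetch a 16-bit little-endian pointer from D + operand,
   then store the accumulator byte where the pointer points. */
uint16_t sta_direct_indirect(uint16_t d_reg, uint8_t operand, uint8_t a)
{
    uint16_t ptr_lo_addr = (uint16_t)(d_reg + operand);
    uint16_t ptr_hi_addr = (uint16_t)(ptr_lo_addr + 1);
    uint16_t target = (uint16_t)(bank0[ptr_lo_addr] | (bank0[ptr_hi_addr] << 8));
    bank0[target] = a;
    return target;
}

/* STA dp: plain direct-page store, for contrast. */
uint16_t sta_direct(uint16_t d_reg, uint8_t operand, uint8_t a)
{
    uint16_t target = (uint16_t)(d_reg + operand);
    bank0[target] = a;
    return target;
}

/* D = $0200 and SOME_LABEL = $0205, as in the post; the pointer value
   $1234 is made up for the example. */
void demo_indirection(void)
{
    bank0[0x0205] = 0x34; /* pointer low byte  */
    bank0[0x0206] = 0x12; /* pointer high byte */

    assert(sta_direct_indirect(0x0200, 0x05, 0xAB) == 0x1234);
    assert(bank0[0x1234] == 0xAB);

    /* The mis-assembled form writes over the pointer itself. */
    assert(sta_direct(0x0200, 0x05, 0xAB) == 0x0205);
    assert(bank0[0x0205] == 0xAB);
}
```

In this model, getting the plain direct-page form when the indirect form was intended corrupts the pointer's low byte rather than the byte it points to.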

The three potential solutions I thought of were these:

  1. Make the linker truncate 16-bit labels on 65816 when .b is used (the best solution IMO, but could be annoying for other uses of the Direct Page)
  2. Somehow change the parser to read my second example the way I describe (although that might end up messy, and it ignores the main issue, although that could just be a design choice, not a real problem)
  3. Have the assembler/linker keep track of the D register through directives the same way it keeps track of the m and x flags (seems like a huge project, and would be annoying if the values need to be constant at assemble time)

Maybe you have other ideas. Sorry for being so needy with your project, I can stop with these issues if they're a pain.

rondnelson99 commented 1 year ago

Since I'm still kind of a novice 65816 programmer, I sought out some opinions and information on the SNES development Discord.

Kulor said that they often use the directpage to access entity structs in RAM. They use code which calculates the address for a struct's base at runtime, and then indexes to the desired struct element using symbols which contain not the memory address of the struct elements, but rather the offset from the struct's base.
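A rough C sketch of that struct-offset pattern follows. Everything in it is hypothetical: the entity layout, the ENTITY_* offset symbols, the table address $0400 and the function names are invented for illustration and do not come from Kulor's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical entity layout: the symbols are offsets from the struct
   base, not absolute addresses. */
enum {
    ENTITY_X    = 0x00,
    ENTITY_Y    = 0x02,
    ENTITY_HP   = 0x04,
    ENTITY_SIZE = 0x08
};

#define ENTITY_TABLE 0x0400u /* hypothetical start of the entity array */

/* Runtime computation of one entity's base address; this is the value
   that would be loaded into D. */
uint16_t entity_base(unsigned index)
{
    return (uint16_t)(ENTITY_TABLE + index * ENTITY_SIZE);
}

/* A direct-page access that uses an offset symbol as its operand. */
uint16_t field_address(uint16_t d_reg, uint8_t field_offset)
{
    return (uint16_t)(d_reg + field_offset);
}

void demo_entity_access(void)
{
    uint16_t d = entity_base(3);                    /* 0x0400 + 3 * 8 */
    assert(d == 0x0418);
    assert(field_address(d, ENTITY_HP) == 0x041C);  /* 0x0418 + 4 */
}
```

Because the operands here are small offsets rather than full addresses, no truncation is needed at all. The trade-off is that D is usually not page-aligned in this style, which brings back the extra cycle mentioned in the first post.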

Nova said that "silent truncation could be a hard-to-diagnose trap for people to run into", and would much prefer explicit truncation for use-cases like these. Like I said in my original post, I can see how this could be a sensible design choice.

I don't think either of these people use WLA, but I figured I'd get some more experienced opinions since I'm so new to the SNES scene, and idk how much you have or haven't used the 65816.

vhelin commented 1 year ago

Hi!

1 might open the door for problems, like not reporting an error when we don't want to truncate a 16-bit value. Adding a flag to do so would be possible, but that'd then be a global setting.

2 sounds like we have an issue in the parser. We have this in the instructions array

{ "STA (<x)", 0x92, 0xA, 0 },

... so the parser should detect such instruction. I'll debug this...

3 is not really possible as the linker doesn't get any code, only bytes, so it has no concept of register D. The assembler could do this, though.

I have to admit that I'm not a SNES programmer and all I know about the 65816 is reading a couple of instruction tables. :)

Explicit truncation sounds like a good solution. I need to think more about all this...

rondnelson99 commented 1 year ago

Thanks a bunch for looking into it. Yeah, having thought about it more, I agree that explicit truncation probably does make the most sense. If I get a chance, I'll see if I can get that special path for sta (<x) to trigger in other scenarios; maybe there was something problematic about the context I was using it in.

vhelin commented 1 year ago

About 2, I think we have a problem here

{ "STA (<x)", 0x92, 0xA, 0 },
{ "STA (x)", 0x92, 0xA, 0 },

So there are two different ways of writing the same instruction. Are we supporting two different 65816 programming styles here? Quite probably, but I cannot remember the origins of the "STA (<x)" form...

As the parser processes "STA (<x)" first, it'll pattern match "STA (<SOME_LABEL)" and expect SOME_LABEL to be an 8-bit value. To use the lobyte operator '<' you'd need to write "STA (<(<SOME_LABEL))" to get it to work as you want.
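The first-match behaviour described in this comment can be illustrated with a toy lookup in C. This is only a sketch and not WLA DX's real parser: the matcher, the function name first_match and the two-entry table are simplified assumptions, with 'x' treated as a placeholder for the operand text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Patterns in table order; 'x' stands for the operand expression. */
static const char *const sta_patterns[] = { "STA (<x)", "STA (x)" };

/* Returns the index of the first pattern that matches `line` and copies
   the text covered by 'x' into `operand`; returns -1 if nothing matches. */
int first_match(const char *line, char *operand, size_t operand_size)
{
    size_t line_len = strlen(line);

    for (size_t i = 0; i < sizeof sta_patterns / sizeof sta_patterns[0]; i++) {
        const char *pattern = sta_patterns[i];
        const char *x = strchr(pattern, 'x');
        size_t prefix_len = (size_t)(x - pattern);
        size_t suffix_len = strlen(x + 1);

        if (line_len <= prefix_len + suffix_len)
            continue;
        if (strncmp(line, pattern, prefix_len) != 0)
            continue;
        if (strcmp(line + line_len - suffix_len, x + 1) != 0)
            continue;

        size_t operand_len = line_len - prefix_len - suffix_len;
        if (operand_len >= operand_size)
            continue;
        memcpy(operand, line + prefix_len, operand_len);
        operand[operand_len] = '\0';
        return (int)i;
    }
    return -1;
}

void demo_first_match(void)
{
    char operand[32];

    /* The pattern consumes the '<', so the operand is the bare label. */
    assert(first_match("STA (<SOME_LABEL)", operand, sizeof operand) == 0);
    assert(strcmp(operand, "SOME_LABEL") == 0);

    /* Doubling the '<' leaves a low-byte expression as the operand. */
    assert(first_match("STA (<(<SOME_LABEL))", operand, sizeof operand) == 0);
    assert(strcmp(operand, "(<SOME_LABEL)") == 0);
}
```

In this model the first '<' never reaches the expression evaluator, because the table entry has already consumed it as literal syntax.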

I see the same pattern repeats in other instructions

{ "AND <x", 0x25, 0xA, 0 },
{ "AND |?", 0x2D, 2, 0 },
{ "AND >&", 0x2F, 3, 0 },
{ "AND x", 0x25, 0xA, 2 },
{ "AND ?", 0x2D, 2, 1 },
{ "AND &", 0x2F, 3, 0 },

and if I removed '<' etc. from the instructions array then, I guess, many wla-65816 projects would not assemble any more.

Currently I see two ways to address this issue:

  1. add support for two 65816 instruction arrays, A being the current one and B being A minus all those '<' etc. versions of the instructions, and let the user choose between these two
  2. modify the 65816 instructions parsers to skip '<' etc. containing instructions in the array when the assembler is given a special flag

Implementing 1 would take a lot of effort, 2 would be quite easy/fast to do, but would add more code to the instruction parsers. As my free time is quite limited I'm thinking about taking route 2. :)

rondnelson99 commented 1 year ago

Interesting. That certainly sounds like an odd programming style, but I guess WLA has been around long enough to see all sorts of things.

As for the two options you just mentioned, am I correct that these would be the same from the user's perspective? Both give an option for the assembler to use the < as an arithmetic operator rather than just being an alternate syntax (effectively ignoring the <).

I guess it's up to you which internal implementation makes the most sense.

Ramsis-SNES commented 1 year ago

> I would like to use e.g. sta.b PPUCTRL (where PPUCTRL = $2115) to write to a register when the directpage points to $2100, but this produces an error at link time since PPUCTRL is a 16-bit value. As a workaround, I use sta <PPUCTRL, which generally does what I want, although I find the syntax odd since PPUCTRL is the actual address I'm writing to. Personally I think the linker should just truncate the address, and either warn or just silently continue since it's entirely plausible that the DP is set so that address does in fact get used.

As a more experienced 65816 programmer using WLA DX, I can't agree with this. Using sta <PPUCTRL, you are, from the assembler's and CPU's perspective, in fact NOT writing to (16-bit) address PPUCTRL but simply to a random DP location. But, that's exactly what you wanted in the first place, since you set the DP register to $2100 in order to access SNES PPU registers that way. :-)

Like many others, I've gotten used to using the STA <x syntax for DP addressing. IMHO, it actually helps code readability, too. I don't see the need to change anything in this regard, especially with the risk of breaking older projects involved.

> The real problem is when indirection is used. Take sta (<SOME_LABEL) where SOME_LABEL is, say $0205, and D=$0200. I want to use the Direct Indirect addressing mode (love these naming conventions lol), on the address $0205 (e.g. D+ $05). It seems that when it sees the < operator (lobyte() does the same thing), it decides that the parentheses are for order of operations, rather than indicating an addressing mode.

It's unclear to me what addressing mode you're referring to exactly. The closest I can think of is either Direct Page indirect (cf. page 108 in Programming the 65816) or Direct Page indirect indexed with Y (cf. page 197 in Programming the 65816), which is not what sta (<SOME_LABEL) reflects, though, as it's missing the Y indexing. It would be helpful if you could provide example code with explanatory comments on the instruction(s) in question. Personally, I don't recall ever running into any problems using indirect addressing with WLA DX.

Thanks! :-)

vhelin commented 1 year ago

Thanks @Ramsis-SNES ! I'll put this on hold on my behalf...

EDIT: I'll still think, though, how to add support for that explicit truncation... Maybe instead of .b you'd have .1, instead of .w you'd have .2...

rondnelson99 commented 1 year ago

Yeah, thanks a lot @Ramsis-SNES, it's definitely good to have an experienced programmer chime in. I get what you mean about the whole explicit truncation thing. In my mind, the .b extension seemed like it should already serve as explicit truncation, but since there are so many other ways to use the direct page, that doesn't really make sense, so I agree with you on your first point.

As for your second point, I was trying to use the non-indexed indirect direct page mode. I see it described here: http://www.6502.org/tutorials/65c816opcodes.html#5.9 I had a pointer in memory at the direct page, and I wanted to read the byte it pointed to.

Finally, I should note that I'm finding this issue, where it assembles lda (<label) as lda <label, to be a little more elusive. I was getting issues the first time I ran into that, but I've since found that it usually works just how I want it to. I'm really not sure what triggers the issue. I guess it's possible that I was doing something else wrong when I first wrote the issue, but I tried changing that line to all sorts of things and then changing it back, and the issue kept popping up. I haven't tried to use it too many times, since after the initial issue I switched to using .db with the opcodes for a while because I thought it was still broken. I'll work on tearing those out and I'll see if anything breaks.

vhelin commented 1 year ago

After thinking about this a bit more I think I'll add a switch that'll make wla-65816 ignore those instructions that have '<', '|' and '>' hard coded into the instructions. That way we don't technically/probably force everybody to use that syntax...

rondnelson99 commented 1 year ago

Ok, I certainly trust whatever you see as the right thing to do here, but to give an update on my project: the problem I started writing this issue about (where lda (<label) gets assembled as lda <label) has oddly disappeared. I've torn out all the lines I'd written to circumvent the problem and everything still seems to assemble just the way I want. I've been on vacation this week so I still haven't had time to look at it in more detail. Maybe it depends on the type of label that's used on the line (it's working now in a RAMSECTION but it may have been in an ENUM or something previously), or maybe I've somehow gone insane and there was never any issue to begin with. Suffice to say, this is really baffling me and I don't really know what to say about it, but I am currently able to use syntax like lda (<label) with 16-bit labels, and it does truncate and assemble as I'd expect.

vhelin commented 1 year ago

I implemented the skipping of instructions that contain hard coded '<', '>' and '|', and wondered why they are skipped by default as well. Well, I found this in the instruction parser loop. Had totally forgotten it, LOL. :)

[Image: screenshot from the instruction parser loop]

By default they are skipped. If you issue .WDC in your assembly code then they are not. And by default '<' is "get low byte" there, the instruction parser should not eat it, the calculation parser should spot it...
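The default described here can be pictured with a small C sketch. It is a toy model and not the code from the screenshot: the wdc_mode flag, the function names and the rule "skip any entry containing '<', '|' or '>'" are assumptions made for illustration, applied to the AND entries quoted earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The AND entries quoted earlier in the thread, in table order. */
static const char *const and_patterns[] = {
    "AND <x", "AND |?", "AND >&", "AND x", "AND ?", "AND &"
};

/* Toy rule: an entry with a hard-coded '<', '|' or '>' is only usable
   when WDC mode is on (modelling the effect of .WDC). */
int entry_is_usable(const char *pattern, int wdc_mode)
{
    return wdc_mode || strpbrk(pattern, "<|>") == NULL;
}

/* Index of the first usable entry, or -1 if there is none. */
int first_usable_entry(const char *const *table, size_t count, int wdc_mode)
{
    for (size_t i = 0; i < count; i++) {
        if (entry_is_usable(table[i], wdc_mode))
            return (int)i;
    }
    return -1;
}

void demo_wdc_switch(void)
{
    size_t count = sizeof and_patterns / sizeof and_patterns[0];

    /* Default: the prefixed entries are skipped, "AND x" comes first. */
    assert(first_usable_entry(and_patterns, count, 0) == 3);

    /* With WDC mode on, "AND <x" is considered first. */
    assert(first_usable_entry(and_patterns, count, 1) == 0);
}
```

Under this model, without WDC mode a leading '<' is never matched as instruction syntax, so it is left for the expression evaluator to treat as "get low byte".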

If you run into an issue again where "LDA (<label)" gets assembled as "LDA <label", let us know!

vhelin commented 1 year ago

As the issue seems to be gone I'll close this, but please do reopen the issue if it comes back!