Incorrect post byte for indexed addressing mode

ceharris commented 3 months ago

Indexed addressing with a large-ish displacement seems to be interpreted incorrectly by the assembler.

org 0
leax -200,x       ; should be 30 89 FF38 (X with 16-bit displacement), but output is 30 81 FF38 (autoincrement by 2)
leax 200,x        ; should be 30 89 00C8 (X with 16-bit displacement), but output is 30 80 C8 (autoincrement by 1)
ldx 16,s          ; should be AE E8 10 (S with 8-bit displacement), but output is AE 80 10 (autoincrement by 1)

Interestingly:

ldx -16,s         ; output is AE 70 (S with 5-bit displacement) as expected
ldx 15,s          ; output is AE 67 (S with 5-bit displacement) as expected

gorsat commented 3 months ago

Yes, I messed this up when I was adding PCR support. I snuck a fix for this into the first commit for #4 (https://github.com/gorsat/6809/commit/40c3bcd7a2d075bab25a6437b22613c5ced89821). Can you pull the latest and let me know if it looks okay? I'm traveling right now and can't try your examples until this evening.

The problem was only with 16-bit offsets so that's why the small offsets worked. Interestingly, the case where I found this is exactly the one you used (LEAX -200,X).

Note that the SETDP work in #4 isn't done yet. I expect to get that in tonight or tomorrow.

gorsat commented 3 months ago

Oops. Not quite. I didn't handle signed 8-bit offsets correctly. This is mostly fixed but there is a problem with the boundary case where offset == -128. I'll have to look at this later this evening.

ceharris commented 3 months ago

Your second test case (in issue5.asm) seems suspect.

x2  leax    200,x   ; should be 30 88 C8

You're using an 8-bit displacement in this case, but 200 is too large to be represented in an 8-bit signed integer. As a result, the displacement that would be applied by the resulting instruction is -56.

gorsat commented 3 months ago

Yep. I eventually figured that out. See above. I still have the corner case to fix but can't get to it until later.

gorsat commented 3 months ago

I hope you're not stuck waiting on me for this one. There is still a low level bug in here having to do with how the assembler decides it needs to use 16 bits vs 8 bits for a given number. I'll have to dig into it a bit further to figure out how to remedy it without screwing up anything else.

ceharris commented 3 months ago

I'm not stuck. 😄

I did take a look at commit 9a38759 and noticed something a little suspect.

In test case x6, you're testing at what should be the boundary for 8-bit signed displacement, but you're expecting a 16-bit displacement. Since -128 = 0x80 hex, you'll want the output here to be 30 88 80.

x6  leax    -128,x  ; should be 30 89 FF80

gorsat commented 3 months ago

Let me know if the latest revision works for you: a5943a4

ceharris commented 3 months ago

Here's a fun corner case.

   1 0000 30   89   00F0                    LEAX     $-10,x    ; should be 30 10
   2 0004 30   10                           LEAX     -16,x
   3 0006 30   89   00F1                    LEAX     $-F,x     ; should be 30 11
   4 000A 30   11                           LEAX     -15,x
   5 000C 30   89   00FF                    LEAX     $-1,x     ; should be 30 1F
   6 0010 30   1F                           LEAX     -1,x
   7 0012 30   01                           LEAX     $1,x
   8 0014 30   01                           LEAX     1,x
   9 0016 30   0F                           LEAX     $F,x
  10 0018 30   0F                           LEAX     15,x

This use of signed hexadecimal literals shows up in ExBasROM.asm (the CoCo extended BASIC source).

Since you don't have a lot of test coverage, and since there's an Intel hex output and listing file for ExBasROM.asm, I used your assembler and used srec_cmp to compare the Intel hex output. Then I poured over the corresponding listings to examine the differences. There's really not many differences, but because of the above difference, a lot of subroutine entry points get shifted quite a bit, so there's a lot of noise.

gorsat commented 3 months ago

The latest changes to low level unsigned->signed conversion broke how these "$-" numbers were handled. The fix I've attempted here is to just tokenize the input as if it were written "-$" instead of "$-" (accounting for repeated '-' symbols). This causes negative hex numbers to be treated the same way as decimal negatives (and preserves the information about the sign beyond the parser, which is the key in this case).

I actually use ExBasROM.asm as one of my main tests before making a commit. I assemble/run it and then run a big basic program within it. It seems somewhat miraculous that the thing worked fine with this bug in place.

At any rate, I need to improve the testing. I think I'll automate what you've described and include it in the built-in tests. I'll open an issue for that.

ceharris commented 3 months ago

I think you can finally put a fork in this one. I see only one difference between your assembler's output and whatever assembler produced the Intel hex and listing outputs for ExBasROM.asm, and it looks like it's due to a bug in that other assembler.

5075 fcc7 91 00                        CMPA $'$'           DOLLAR SIGN?

That last byte should be 24, as output by your assembler.

gorsat / 6809

Incorrect post byte for indexed addressing mode #5