achan1989 / ghidra-65816

WDC 65816 processor module for Ghidra
MIT License

Immediate Loads are treated as Offsets rather than values #10

Open oziphantom opened 3 years ago

oziphantom commented 3 years ago
                             LAB_80805b+1                                    XREF[0,2]:   80805b(R), 80805b(R)  
                             LAB_80805b+2
          80805b a2 00 00        LDX        #$0x0=>LAB_80805b+1
          80805e 22 42 a7 80     JSL        !>$DAT_80a742                                    = 00A900h
          808062 22 a9 e8 80     JSL        !>$DAT_80e8a9                                    = 30C28Bh
          808066 22 5b ff 90     JSL        !>$PTR_DAT_90ff5b                                = ad20e2
          80806a 22 34 84 80     JSL        !>$PTR_LAB_808434                                = 849e22
                             LAB_80806e+1                                    XREF[0,2]:   80806e(R), 80806e(R)  
                             LAB_80806e+2
          80806e a2 cd 9b        LDX        #$0x9bcd=>LAB_80806e+1
          808071 22 34 e9 80     JSL        !>$DAT_80e934                                    = 00A000h
          808075 78              SEI
          808076 d8              CLD
          808077 c2 30           REP        #$0x30
          808079 22 2d 8f 80     JSL        !>$DAT_808f2d                                    = 586464h
                             LAB_80807d+1                                    XREF[0,2]:   80807d(R), 80807d(R)  
                             LAB_80807d+2
          80807d a2 ff 1f        LDX        #$0x1fff=>LAB_80807d+1
          808080 9a              TXS
                             LAB_808081+1                                    XREF[0,2]:   808081(R), 808081(R)  
                             LAB_808081+2
          808081 a9 00 00        LDA        #$0x0=>LAB_808081+1

Rather than showing LDA #0000, it adds LAB_808081+1 and LAB_808081+2 and then references itself. It seems to think it's like an ARM processor. Is there some setting or register I need to configure in order to get these "to just be values"?

oziphantom commented 3 years ago

It also doesn't offset based on DP, and JSR $XXXX doesn't get picked up as an address in the current bank, even if I manually set the PBR.

oziphantom commented 3 years ago

I think I've worked out what it is doing: it's adding memory references to them. So it doesn't understand that it's immediate mode and that it shouldn't add a reference.

oziphantom commented 3 years ago

I now think it is backwards: it adds references to # operands but not to absolute ones. I've been staring at the sinc files but I can't see anything that tells Ghidra "make a reference here", and Google doesn't turn up anything either.

achan1989 commented 3 years ago

Can you be more explicit? "I think bytes aa bb cc at address n should decode to blah". What processor mode do you think the chip should be in, and have you set it?

oziphantom commented 3 years ago

ok so given

                    ********************************************
                    *                SUBROUTINE                *
                    ********************************************
                    SUB_8082c2                        XREF[1]: 80813d(c)  
     8082c2 e2 10      SEP     #$0x10
                    LAB_8082c4+1                      XREF[0,  8082c4(R)  
     8082c4 a0 01      LDY     #$0x1=>LAB_8082c4+1
     8082c6 a5 44      LDA     <$0x44
     8082c8 10 23      BPL     $LAB_8082ed
                    LAB_8082ca+1                      XREF[0,  8082ca(R), 8082ca(R)  
                    LAB_8082ca+2
     8082ca 29 ff 7f   AND     #$0x7fff=>LAB_8082ca+1
     8082cd 85 44      STA     <$0x44
     8082cf a5 46      LDA     <$0x46
     8082d1 8d 02 43   STA     !$0x4302
                    LAB_8082d4+1                      XREF[0,  8082d4(R), 8082d4(R)  
                    LAB_8082d4+2
     8082d4 a9 00 22   LDA     #$0x2200=>LAB_8082d4+1
     8082d7 8d 00 43   STA     !$0x4300
                    LAB_8082da+1                      XREF[0,  8082da(R)  
     8082da a2 7e      LDX     #$0x7e=>LAB_8082da+1
     8082dc 8e 04 43   STX     !$0x4304
                    LAB_8082df+1                      XREF[0,  8082df(R), 8082df(R)  
                    LAB_8082df+2
     8082df a9 00 02   LDA     #$0x200=>LAB_8082df+1
     8082e2 8d 05 43   STA     !$0x4305
                    LAB_8082e5+1                      XREF[0,  8082e5(R)  
     8082e5 a2 00      LDX     #$0x0=>LAB_8082e5+1
     8082e7 8e 21 21   STX     !$0x2121
     8082ea 8c 0b 42   STY     !$0x420b
                    LAB_8082ed                        XREF[1]: 8082c8(j)  
     8082ed 24 44      BIT     <$0x44
     8082ef 70 32      BVS     $LAB_808323
     8082f1 9c 02 21   STZ     !$0x2102
                    LAB_8082f4+1                      XREF[0,  8082f4(R), 8082f4(R)  
                    LAB_8082f4+2
     8082f4 a9 10 04   LDA     #$0x410=>LAB_8082f4+1
     8082f7 8d 00 43   STA     !$0x4300
                    LAB_8082fa+1                      XREF[0,  8082fa(R), 8082fa(R)  
                    LAB_8082fa+2
     8082fa a9 df 03   LDA     #$0x3df=>LAB_8082fa+1
a0 01 LDY #0x01
this should not make a memory reference to LAB_8082c4+1 as it is immediate
a5 44 LDA <DAT_BANK00_44
this should make a read memory reference to 00:D+$44 (bank 0, direct page register plus $44), which in this particular case would be $000044 (since I'm on a SNES and Ghidra's byte-mapped blocks don't actually map references through the byte map, I would still need to change this to $7e0044, but that is a SNES and Ghidra problem)
10 23 BPL LAB_8082ed this is fine above, except that it has a $ in it. While $ makes sense, Ghidra puts 0x in front of everything, which is wrong, but $0x is more wrong ;) and $Label is very wrong
29 ff 7f   AND #0x7fff again no reference to any labels as this is immediate
85 44      STA     DAT_BANK00_0044     XREF[0,  000044(W)
a5 46      LDA     DAT_BANK00_0046     XREF[0,  000046(R)
8d 02 43   STA     DAT_BANK80_4302     XREF[0,  804302(W)
a9 00 22   LDA     #0x2200
8d 00 43   STA     !DAT_BANK80_4300    XREF[0,  804300(W)
a2 7e      LDX     #0x7e
8e 04 43   STX     !DAT_BANK80_4304    XREF[0,  804304(W)
a9 00 02   LDA     #0x200
8d 05 43   STA     !DAT_BANK80_4305    XREF[0,  804305(W)
a2 00      LDX     #0x0
8e 21 21   STX     !DAT_BANK80_2121    XREF[0,  802121(W)
8c 0b 42   STY     !DAT_BANK80_420b    XREF[0,  80420b(W)
24 44      BIT     <DAT_BANK00_0044    XREF[0,  000044(R)
70 32      BVS     LAB_808323
9c 02 21   STZ     !DAT_BANK80_2102    XREF[0,  802102(W)
a9 10 04   LDA     #0x410
8d 00 43   STA     !DAT_BANK80_4300    XREF[0,  804300(W)
a9 df 03   LDA     #0x3df
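
The rule behind the expected listing above can be sketched in a few lines of Python. This is only an illustration, assuming native mode; the function name and the mode strings are hypothetical and are not taken from the processor module or from Ghidra. Immediate operands yield no reference, direct page operands resolve into bank 0 via the D register, and absolute operands take their bank from DBR (data) or PBR (JSR/JMP targets).

```python
# Hypothetical helper for illustration only; the mode names are made up here
# and do not correspond to anything in the SLEIGH spec.
def reference_target(mode, operand, d=0x0000, dbr=0x00, pbr=0x00):
    """Return the 24-bit address an operand refers to, or None for no reference."""
    if mode == "immediate":
        # '#' operands are plain values, so no memory reference is created.
        return None
    if mode == "direct":
        # Direct page: always bank 0, offset by the D register (16-bit wrap).
        return (d + operand) & 0xFFFF
    if mode == "absolute_data":
        # Absolute data access: bank comes from the data bank register.
        return (dbr << 16) | (operand & 0xFFFF)
    if mode == "absolute_code":
        # JSR/JMP absolute: bank comes from the program bank register.
        return (pbr << 16) | (operand & 0xFFFF)
    if mode == "long":
        # Long addressing already carries the full 24-bit address.
        return operand & 0xFFFFFF
    raise ValueError("unknown addressing mode: %s" % mode)


# Values taken from the listing in this thread:
ldy = reference_target("immediate", 0x01)                   # no reference
lda = reference_target("direct", 0x44)                      # 0x000044
sta = reference_target("absolute_data", 0x4302, dbr=0x80)   # 0x804302
```

With D = 0 and DBR = $80 this reproduces the 000044 and 804302 cross-references in the listing, while the LDY #0x01 gets none.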

If it is possible, it would be nice to have selectable SNES LoROM and SNES HiROM modes that understand the memory map better. For LoROM, for example, it would automatically map offsets below $2000 to 7e:XXXX and put everything in 2000-7fff into 00:XXXX, so that all the DP variables map to a single RAM bank and all the registers map to a single bank. It would also understand that 80:XXXX == 00:XXXX and pick one depending on whether your code is nominally in 80XXXX or 00XXXX, but that is pie-in-the-sky dreaming. The important thing is to stop having to press Delete on every # operation to remove the wrong references, and to have it automatically add references in the other cases; those will probably be wrong and need changing, but at least some will come out correct automatically.
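
A rough sketch of the LoROM canonicalisation described above, assuming the usual SNES layout in which banks 00-3F and 80-BF mirror the first 8 KiB of WRAM at 0000-1FFF and expose the hardware registers at 2000-7FFF. The function name and the `prefer_high` parameter are hypothetical, and other banks are simply passed through unchanged.

```python
# Hypothetical sketch; not part of ghidra-65816 or ghidra-snes-loader.
def canonical_lorom(addr, prefer_high=True):
    """Fold a 24-bit SNES address onto one canonical alias (LoROM assumption)."""
    bank = (addr >> 16) & 0xFF
    offset = addr & 0xFFFF
    if (bank & 0x7F) >= 0x40:
        # Banks 40-7F and C0-FF (including WRAM at 7E/7F) are left alone.
        return addr & 0xFFFFFF
    if offset < 0x2000:
        # Low RAM mirror: send every alias to bank 7E.
        return 0x7E0000 | offset
    if offset < 0x8000:
        # Hardware registers: send every alias to bank 00.
        return offset
    # ROM half of the bank: pick the 80+ or the 00+ alias.
    rom_bank = (bank | 0x80) if prefer_high else (bank & 0x7F)
    return (rom_bank << 16) | offset


dp_var = canonical_lorom(0x000044)    # 0x7E0044
dma_reg = canonical_lorom(0x804302)   # 0x004302
code = canonical_lorom(0x008081)      # 0x808081
```

The `prefer_high` switch stands in for "pick one depending on where your code nominally lives"; a real loader would more likely take that from the ROM header or a user option.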

Also the assembler format? Merlin?

oziphantom commented 3 years ago

Forgot: it's in Native mode and I've set it, but as you can see it correctly gets 8- and 16-bit loads.

achan1989 commented 3 years ago

Those do look like bugs. However I haven't worked on this project in a long time, nor do I use it at the moment, so I'm unlikely to fix it any time soon.

This is only a processor module. Memory mapping the ROM etc is an entirely different problem -- I did start trying to do that in https://github.com/achan1989/ghidra-snes-loader but I didn't get very far. I'm not exactly a fan of Java.

Not sure what the assembler format is. I can't remember if some of it is imposed by Ghidra. And I think I based some of it off the format in the WDC 65816 programming guide.

oziphantom commented 3 years ago

I hear you on Java. At the moment, to get a GUI in a script you have to use Jython and Swing. You can't just add Tk, for example. So coding Java via Python is... well, you can imagine. Still, it beats installing Eclipse ;)

It just seems that one thing is backwards: # adds references when everything else doesn't, and that needs to be flipped and "job done". But I've looked at it and I can't see anything it does that the 6502 one doesn't.

I made a lot of scripts that process that data, strip the wrong labels and add the right ones for SNES etc. But it gets very tedious, as I have to go through each subroutine one by one, comb through the code to find all of them, and then manually select them and trigger the script on them. Rather than hit D and let Ghidra follow and disassemble all that it can find.

achan1989 commented 3 years ago

> Rather than hit D and let Ghidra follow and disassemble all that it can find.

That would be the dream. I think there's a way to build a custom analyser into the processor module -- Ghidra could then set the proper mode flags automatically and everything would just work. I couldn't figure out how to do it though. It's pretty advanced functionality, the documentation of Ghidra internals isn't great for stuff like that, and I got lost reading the code. Oh well :(

oziphantom commented 3 years ago

Well, the processor sets the global state, which basically handles the MF/XF flags. It's just that Ghidra doesn't know that the jumps etc. are "things to follow" and ignores them, yet it thinks the # instructions are things it should be looking at.
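
For reference, the flag tracking being discussed is small enough to sketch in Python. This is only an illustration of how REP/SEP change the M and X bits and how those bits decide the width of an immediate operand in native mode; the function names are hypothetical and this is not how the processor module is implemented.

```python
# Hypothetical sketch of M/X flag tracking in native mode.
M_FLAG = 0x20  # set = 8-bit accumulator/memory, clear = 16-bit
X_FLAG = 0x10  # set = 8-bit index registers, clear = 16-bit
REP, SEP = 0xC2, 0xE2  # opcodes, as seen in the listings above


def update_status(p, opcode, operand):
    """Return the status register after a REP or SEP; other opcodes leave it alone."""
    if opcode == REP:
        return p & ~operand & 0xFF   # REP clears the selected bits
    if opcode == SEP:
        return (p | operand) & 0xFF  # SEP sets the selected bits
    return p


def immediate_size(register, p):
    """Bytes of immediate data for an operation on 'A', 'X' or 'Y'."""
    flag = M_FLAG if register == "A" else X_FLAG
    return 1 if p & flag else 2


p = M_FLAG | X_FLAG              # start with both registers 8-bit
p = update_status(p, REP, 0x30)  # c2 30  REP #$30 -> A and X/Y 16-bit
wide_x = immediate_size("X", p)  # 2, matching "a2 ff 1f  LDX #$1fff"
p = update_status(p, SEP, 0x10)  # e2 10  SEP #$10 -> X/Y back to 8-bit
narrow_y = immediate_size("Y", p)  # 1, matching "a0 01  LDY #$01"
wide_a = immediate_size("A", p)    # 2, matching "29 ff 7f  AND #$7fff"
```

This only covers straight-line code; following the flags across jumps and calls is the part that would need the analyser support mentioned earlier in the thread.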