stella-emu / stella

A multi-platform Atari 2600 Emulator
https://stella-emu.github.io
GNU General Public License v2.0
618 stars 112 forks source link

false read issue with NOP #131

Closed SpiceWare closed 7 years ago

SpiceWare commented 7 years ago

After fixing a Harmony/Melody driver issue with this sequence of instructions:

BNE loop
LDA #DATA_STREAM

which caused the driver to enter "Fast Fetch" mode if the branch was taken due to the "false read" of the LDA #, Chris mentioned that the same problem most likely occurs with the new Fast Jump. Sure enough, I built a new test with this sequence:

BNE loop
JMP $0000

and it crashes the demo in both Stella (Stella itself doesn't crash) and on hardware. Then I changed that code to this:

BNE loop
NOP
JMP $0000

expecting it to work. And it did on my 2600, but it still crashed in Stella. If I changed that NOP to NOP 0, LDY #4, etc then it runs fine in Stella.

If I trap Stella on the NOP instruction:

screen shot 2017-05-07 at 7 26 25 pm

Then set a breakpoint in Xcode on CartridgeCDF::peek(), when I do a single step for the NOP I encounter two breaks - one for peek($F0B6), followed by a peek($F0B7).

That second peek causes the cartridge class to think the JMP $0000 is being executed, so enters "fast jump mode" which triggers the next two peeks to be returned from the jump datastream. To confirm that's the source of the crash:

1) just finished the NOP instruction

screen shot 2017-05-07 at 7 16 54 pm

2) the Jump Datastream is pointing at location 4 of Display Data RAM:

screen shot 2017-05-07 at 7 17 00 pm

3) Display Data starts at $0800, so if the next 2 peeks are overridden for "fast jump" it will return $BA then $F0.

screen shot 2017-05-07 at 7 17 05 pm

4) After clicking on STEP, the PC is now at F0B8, so the peek returned the $BA, instead of the $4C, and thus a TSX instruction was executed. This can be be confirmed by seeing that the X register, which used to contain $0F, now contains $FF (same as the stack pointer) and the PC only advanced by 1 byte.

screen shot 2017-05-07 at 7 17 21 pm

Since the NOP version works correctly on hardware, which will crash if a false read of the JMP instruction occurs, I don't think the NOP should be issuing a false read.

The ZIP includes 4 versions of the demo, the filename tells the instruction sequence for the test fastjump.zip

thrust26 commented 7 years ago

IIRC and according to the current Stella code, there is always an extra, unused read after every single byte instruction (see M6502_IMPLIED in M6502.m4).

So if you replace NOP with e.g. CLC, does this have the same effect on your hardware?

DirtyHairy commented 7 years ago

Afaik, the 6502 does a bus access on every cycle by construction​, so the second read should be fine. However, the address might be wrong, I'll check with the MOS manual later.

DirtyHairy commented 7 years ago

I couldn't find anything in the MOS manual on a first glance, but I just checked the visual6502 simulator and it does the second read access to the next instruction (0x0001 if the NOP was at 0x0000), so it looks as if Stella is correct here (if I read the source correctly).

SpiceWare commented 7 years ago

I got to thinking last night that perhaps what's happening, because of the lack of typical data signals on the 2600's cartridge port, that the Harmony/Melody driver does this:

  1. normal read of NOP occurs, H/M sees peek, fetches $ea, puts it on the bus
  2. false read of JMP occurs - H/M sees peek, fetches $4c, puts it on the bus
  3. real read of JMP occurs - H/M doesn't "see it" because it's the exact same address. The $4c is still on the bus from #2, so the 6507 sees the JMP as expected.

I don't quite follow how the driver works (as I'm not familiar with ARM assembly) so I've posted that over at AA to see if Chris concurs.

DirtyHairy commented 7 years ago

That sounds pretty convincing. Unless the driver is counting cycles, it can only trigger on address bus transitions, so it won't see the second read. If this is the case, we could (and should) emulate this correctly :smirk: (I'm also thinking about DPC+ fast fetch, should be the same issue there).

thrust26 commented 7 years ago

Can we close this one?

SpiceWare commented 7 years ago

Haven't heard back from Chris yet.

SpiceWare commented 7 years ago

Chris agrees that Stella's NOP implementation is correct. More info from Chris' blog entry:

1) Wait for an address request from the Atari (A12 high). 2) Read address bus (A0-A11). 3) Fetch data from flash memory at the requested address (may perform bankswitching here). 4) Assert data on the data bus (D0-D7). 5) Wait until there is a change of address (A0-A12). 6) Repeat forever.

Step 5 matches my thoughts above on why it worked on the Harmony.