mre / mos6502

MOS 6502 emulator written in Rust
BSD 3-Clause "New" or "Revised" License
82 stars 15 forks source link

Panic when executing `jmp ($ffff)` #86

Closed omarandlorraine closed 7 months ago

omarandlorraine commented 1 year ago

While executing jmp ($ffff), I get the following panic:

thread 'main' panicked at 'range end index 65537 out of range for slice of length 65536', /home/sam/.cargo/registry/src/index.crates.io-6f17d22bba15001f/mos6502-0.4.0/src/memory.rs:80:10
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

The instruction reads two bytes from memory, and uses that two-byte value as a pointer to the jump's target. The implementation is trying to read a slice from the memory to get this pointer, but this fails when the real chip would read from address $ffff, wrapping over from the end of the address space to the beginning.

I appreciate that's the kind of thing we're vanishingly unlikely to run into in the wild, though!

omarandlorraine commented 1 year ago

When I think about it some more, it's really because the emulator does not implement the Indirect Jump Bug.

The real deal wouldn't read the pointer from bytes $ffff and $0000, but from bytes $ffff and $ff00. This is because while incrementing the memory address register, the carry does not propagate to the high byte.

(This is again a difference between the NMOS and CMOS varieties; CMOS does fix this problem)

mre commented 1 year ago

Thanks for the detailed analysis. What would you suggest? I wonder if at some point we need to implement different modes for NMOS and CMOS.

omarandlorraine commented 1 year ago

What would you suggest?

I would suggest re-writing around src/cpu.rs line 151. The current code is almost correct for the CMOS variant. But for the NMOS variant, the necessary code is quite finicky but I'll see what I can put together.

I wonder if at some point we need to implement different modes for NMOS and CMOS.

Yes, and other variants too, maybe? I could contribute the Rockwell bit manipulation instructions for example. For my purposes, it would be good to be able to instantiate several different 6502 varieties in my one program, so I think that excludes conditional compilation as a possibility. Instead, I'd suggest making an enum of all supported variants, and having this as a member of the Cpu struct. Then when we instantiate the Cpu, we pick the correct enum variant. And as we run the Cpu, here and there we can check the enum to see if we want to emulate the Indirect Jump Bug, the Decimal Mode, this thing or that thing.

Something like this:

pub enum Mos6502Variant {
    RevisionA,
    Nmos,
    Cmos,
    Ricoh2a03,
    Rockwell
}

And instead of this method:

    pub fn new(memory: M) -> CPU<M>

on line 49 of src/cpu.rs we'll need, you know,

or something. But I haven't figured out what to do about the OPCODES array in src/instruction.rs. We need to find a different solution than that if we want to support multiple, instruction sets that are nearly the same. Maybe a match statement.

mre commented 1 year ago

I gave this some thought and here's an alternative I came up with:

Instead of an enum for the different MOS6502 variants, we could define them as individual types. This allows us to harness Rust's generics to define specific behaviors or opcode sets for each variant.

So

pub struct Nmos;
pub struct RevisionA;
pub struct Cmos;
pub struct Ricoh2a03;
pub struct Rockwell;

We could then define a trait that provides a method to fetch the opcode implementation for a given byte value. Each variant would implement this trait to provide its unique set of opcodes.


pub trait OpcodeSet {
    fn get_opcode(byte: u8) -> Option<Opcode>;
}

impl OpcodeSet for Nmos {
    fn get_opcode(byte: u8) -> Option<Opcode> {
        match byte {
            // ... Add opcodes specific to Nmos
            _ => None,
        }
    }
}

There would of course be duplication, but each set of opcodes would be nicely encapsulated for each variant (and I don't expect too many changes once a variant got implemented).

After that, our CPU could be defined with a generic parameter representing the variant.

pub struct CPU<V, M> {
    variant: PhantomData<V>,  // Used to associate the variant with CPU without consuming space
    memory: M,
    // ... other fields
}

impl<V: OpcodeSet, M> CPU<V, M> {
    pub fn new(memory: M) -> Self {
        CPU {
            variant: PhantomData,
            memory,
            // ... initialize other fields
        }
    }

    pub fn execute_opcode(&self, byte: u8) {
        if let Some(opcode) = V::get_opcode(byte) {
            // Execute the opcode
        }
    }
}

We could instantiate the CPU for a specific variant as follows:

let cpu: CPU<Nmos, _> = CPU::new(memory);

Benefits:

Downsides

I think the type safety and encapsulation of these variants would still be worth it, but I'd love to hear your thoughts on this. I'm very much open to discuss different design options.

What do you think?

omarandlorraine commented 1 year ago

Huh, I thought I'd already replied to this.

I like your idea better than mine, and I think it's a good way to code up a clear separation between "opcode" and "instruction". A variant might have the same instruction on a different opcode. So I think that your idea of an OpcodeSet trait is pretty much spot on.

But then there's the "Other Stuff". This is variation between the models other than what's in the instruction set. Things like

  1. Have we got the Indirect Jump Bug, yes or no
  2. Do we have decimal mode, yes or no
  3. Do RMW instructions write the original value before overwriting it again with the correct value, yes or no
  4. From what address do absolute indexed addressing modes do a spurious read, if the address calculation crosses into a different page and needs to take an extra cycle for carry propagation?

Points 1 and 2 are the important, software compatibility breaking ones, and points 3 and 4 matter for cycle accuracy only and so are less important in my view.

And so far as I can see there are two big ways we can deal with these questions.

A. We have three implementations of add_with_carry, and let the OpcodeSet thing call the appropriate one. Same with subtract_with_borrow and AddressingMode::Indirect. B. Make sure the OpcodeSet trait has functions like fn enable_decimal_mode() -> bool and fn buggy_indirect_jump() -> bool. Then the CPU calls these in the appropriate places.

Personally I prefer Option A, but what are your thoughts?

mre commented 1 year ago

Yes, I agree.

Given our emphasis on the clear separation between "opcode" and "instruction" and the various quirks among different variants, Option A seems more fitting. While it might introduce some code duplication, the benefits of explicitness, performance, and modularity seem worth the trade-off.

As you mentioned, we can use shared utility functions to reduce code duplication.

omarandlorraine commented 1 year ago

Alright, in that case, I guess the next step is to work on a branch which moves the Decimal Mode option into an OpcodeSet thing. That removes the "decimal" feature (we're no longer using conditional compilation for this) and correspondingly puts a method add_with_carry_no_decimal on the CPU. Then we've got feature parity with the existing master branch, but we're taking the new approach to variant emulation.

You want to take that on? Then when that's done, I'll make a start on:

mre commented 1 year ago

Sure can do, even though I don't know when I'll find the time to do it. Feel free to take a stab at it first, and maybe we can collaborate on the PR. Alternatively, I'll try to make some time for it, but no guarantees. 😅

mre commented 7 months ago

I think this can be tackled now that we have support different chip variants. @omarandlorraine, just in case you still want to submit a patch for jmp on NMOS.