marhel / r68k

A m68k emulator in rust - let r68k = musashi.clone();
MIT License
71 stars 7 forks source link

Implement DIVS #3

Closed marhel closed 8 years ago

marhel commented 8 years ago

Please implement the instruction for Signed Divide. For more information, please read the instruction contribution guidelines

marhel commented 8 years ago

I can take this now

marhel commented 8 years ago

I've noticed two issues in Musashis cycle counts for DIVS and DIVU. For one, it uses the worst-case timings always. EASy68k seems to use the algorithm from an old usenet post, if looking at the EASy68k source (note that this is not the official sources, which as far as I can see are only available as a zip through the EASy68k website, but at a glance it's the same implementation)

Secondly, for the division-by-zero trap, it adds an additional 38 cycles to the worst case, which seems wrong. I noted something similar when looking at the CHK instruction trap, which does the same thing (includes instruction cycles in exception case). It didn't seem clearly wrong in the case of CHK, but in case of DIVS/DIVU it is obvious, I think, that an division by zero wouldn't take 158+38+ea cycles, just 38+ea.

This is also confirmed when looking at cycles reported in EASy68k for the div-by-zero case.

We need to decide if we care about more precise cycle counts for DIVS/DIVU, versus raw performance, and if we do that, how to handle the fact that some Musashi timings then will no longer match (unless we also patch Musashi). Some exception-rule in the QC-tests, perhaps.

I think it's fairly simple to correct the Trap timings for CHK and DIVS/DIVU, and perhaps this is also simple to patch into Musashi, as I think it's possible to "consume" cycles from within the op-implementation.

emoon commented 8 years ago

That is interesting for sure. I'm trying to figure out how UAE does this and found this

/* DIVS.W Dn,Dn */
uae_u32 REGPARAM2 CPUFUNC(op_81c0_0)(uae_u32 opcode)
{
    uae_u32 srcreg = (opcode & 7);
    uae_u32 dstreg = (opcode >> 9) & 7;
{{  uae_s16 src = m68k_dreg (regs, srcreg);
{   uae_s32 dst = m68k_dreg (regs, dstreg);
    if (src == 0) {
        divbyzero_special (1, dst);
    m68k_incpc (2);
        Exception (5);
        goto l_1278;
    }
    CLEAR_CZNV ();
    if (dst == 0x80000000 && src == -1) {
        SET_VFLG (1);
        SET_NFLG (1);
    } else {
        uae_s32 newv = (uae_s32)dst / (uae_s32)(uae_s16)src;
        uae_u16 rem = (uae_s32)dst % (uae_s32)(uae_s16)src;
        if ((newv & 0xffff8000) != 0 && (newv & 0xffff8000) != 0xffff8000) {
            SET_VFLG (1);
            SET_NFLG (1);
        } else {
            if (((uae_s16)rem < 0) != ((uae_s32)dst < 0)) rem = -rem;
    CLEAR_CZNV ();
    SET_ZFLG (((uae_s16)(newv)) == 0);
    SET_NFLG (((uae_s16)(newv)) < 0);
            newv = (newv & 0xffff) | ((uae_u32)rem << 16);
            m68k_dreg (regs, dstreg) = (newv);
        }
    }
    m68k_incpc (2);
}}}l_1278: ;
return 142 * CYCLE_UNIT / 2;
}
emoon commented 8 years ago

It seems that they use 142 as cycle count (cycle unit seems to be 512, unsure what that is)

emoon commented 8 years ago

regarding qc test we could have one variant that is like qc_no_cycle_check or something like that if we want. I think we should go with an implementation that we think is correct and even if Musashi does something different that shouldn't stop us (imo)

That being said I'm not sure what the best approach would be here.

marhel commented 8 years ago

Hm! Doesn't the fragment

    if (src == 0) {
        divbyzero_special (1, dst);
        m68k_incpc (2);
        Exception (5);
        goto l_1278;
...
l_1278: ;
    return 142 * CYCLE_UNIT / 2;

also suggest they consume at least 142 cycles, even if div-by-zero? Unless divbyzero_special consume a negative amount of cycles?

emoon commented 8 years ago

I checked the code and they doesn't seem to do anything with the cycles in it, they just set status registers of the CPU according to the CPU type (020+ etc)

marhel commented 8 years ago

I've pushed DIVS/DIVU now, with behaviour like Musashis. I'll keep these issues open as a reminder to correct the Trap timings later. I'll open a new issue on accurate cycle emulation for DIVS/DIVU.

emoon commented 8 years ago

Sounds good.

marhel commented 8 years ago

Closing, but see #68.