SingleStepTests / z80

Repository for JSMoo-based Z80 JSON test files for Z80
MIT License
8 stars 1 forks source link

Instruction timing discrepancies #3

Open itsmevjnk opened 1 week ago

itsmevjnk commented 1 week ago

(crossposted from here)

While using the generated Z80 tests for testing my Z80 emulator, I discovered a couple of discrepancies in the timings of certain instructions tested in the generated cases in comparison to floooh's Z80 netlist simulator analysis:

LD (IX+d)/(IY+d),n

floooh's writeup states that the offset computation cycles are overlaid on the n value read cycle, with the latter immediately following the d offset read cycle:

┌─────┬────┬──────┬──────┬────┬────┬──────┬────┬──────┬──────┬──────┐
│  T  │ M1 │ MREQ │ RFSH │ RD │ WR │ AB   │ DB │ PC   │ IX   │ WZ   │
├─────┼────┼──────┼──────┼────┼────┼──────┼────┼──────┼──────┼──────┤
│ 9/0 │    │      │      │    │    │ 0006 │ 36 │ 0006 │ 1000 │ 5555 │ <== memory read to load 'd'
│ 9/1 │    │ MREQ │      │ RD │    │ 0006 │ 36 │ 0007 │ 1000 │ 5555 │
...
│11/1 │    │      │      │    │    │ 0006 │ 03 │ 0007 │ 1000 │ 5555 │
│12/0 │    │      │      │    │    │ 0007 │ 03 │ 0007 │ 1000 │ 5555 │ <== memory read to load 'n'
│12/1 │    │ MREQ │      │ RD │    │ 0007 │ 03 │ 0008 │ 1000 │ 5555 │
...
│14/1 │    │      │      │    │    │ 0007 │ 0B │ 0008 │ 1000 │ 5503 │
│15/0 │    │      │      │    │    │ 0007 │ 0B │ 0008 │ 1000 │ 5503 │ <== 2 remaining extra clock cycles
...
│16/1 │    │      │      │    │    │ 0000 │ 0B │ 0008 │ 1000 │ 1003 │
│17/0 │    │      │      │    │    │ 1003 │ 0B │ 0008 │ 1000 │ 1003 │ <== LD (HL),n continues

However, the generated tests expect the n read cycle to be performed at the end of the offset computation cycle; in other words, the extra clock cycles above are moved to between d and n loading cycles:

/* offset read cycle */
      [
        3848,
        null,
        "----"
      ],
      [
        3848,
        null,
        "r-m-"
      ],
      [
        3848,
        59,
        "----"
      ],
/* 2 extra cycles */
      [
        null,
        null,
        "----"
      ],
      [
        null,
        null,
        "----"
      ],
/* value read cycle */
      [
        3849,
        null,
        "----"
      ],
      [
        3849,
        null,
        "r-m-"
      ],
      [
        3849,
        185,
        "----"
      ],

EX (SP),HL/IX/IY

In floooh's writeup, this instruction essentially pops a 16-bit value off the stack into WZ, then pushes HL (or IX/IY) onto the stack before setting it to WZ, creating an "SP+0 read -> SP+1 read -> SP+1 write -> SP+0 write" address sequence (note the AB column on the second half-cycle of the first memory read/write cycles):

┌─────┬────┬──────┬──────┬────┬────┬──────┬────┬──────┬──────┬──────┬──────┐
│  T  │ M1 │ MREQ │ RFSH │ RD │ WR │ AB   │ DB │ PC   │ HL   │ SP   │ WZ   │
├─────┼────┼──────┼──────┼────┼────┼──────┼────┼──────┼──────┼──────┼──────┤
│ 5/0 │    │      │      │    │    │ 00FE │ E3 │ 000B │ 4321 │ 00FE │ 5555 │ <== memory read
│ 5/1 │    │ MREQ │      │ RD │    │ 00FE │ E3 │ 000B │ 4321 │ 00FE │ 5555 │
...
│ 7/1 │    │      │      │    │    │ 00FE │ 34 │ 000B │ 4321 │ 00FE │ 0034 │ <== Z: low byte from stack
│ 8/0 │    │      │      │    │    │ 00FF │ 34 │ 000B │ 4321 │ 00FE │ 0034 │ <== memory read
│ 8/1 │    │ MREQ │      │ RD │    │ 00FF │ 34 │ 000B │ 4321 │ 00FE │ 0034 │
...
│10/1 │    │      │      │    │    │ 00FF │ 12 │ 000B │ 4321 │ 00FE │ 1234 │ <== WZ: 16 bit value from stack
│11/0 │    │      │      │    │    │ 00FF │ 12 │ 000B │ 4321 │ 00FE │ 1234 │ <== extra clock cycle
│11/1 │    │      │      │    │    │ 00FE │ 12 │ 000B │ 4321 │ 00FF │ 1234 │ <== SP incremented
│12/0 │    │      │      │    │    │ 00FF │ 12 │ 000B │ 4321 │ 00FF │ 1234 │ <== memory write (L => stack)
│12/1 │    │ MREQ │      │    │    │ 00FF │ 43 │ 000B │ 4321 │ 00FF │ 1234 │
...
│14/1 │    │      │      │    │    │ 00FE │ 43 │ 000B │ 4321 │ 00FE │ 1234 │
│15/0 │    │      │      │    │    │ 00FE │ 12 │ 000B │ 4321 │ 00FE │ 1234 │ <== memory write (H => stack)
│15/1 │    │ MREQ │      │    │    │ 00FE │ 21 │ 000B │ 4321 │ 00FE │ 1234 │
...
│17/1 │    │      │      │    │    │ 00FE │ 21 │ 000B │ 4321 │ 00FE │ 1234 │

Whereas the generated tests expect an "SP+0 read -> SP+1 read -> SP+0 write -> SP+1 write" sequence instead:

/* SP+0 read */
      [
        24218,
        null,
        "----"
      ],
      [
        24218,
        null,
        "r-m-"
      ],
      [
        24218,
        76,
        "----"
      ],
/* SP+1 read */
      [
        24219,
        null,
        "----"
      ],
      [
        24219,
        null,
        "r-m-"
      ],
      [
        24219,
        137,
        "----"
      ],
/* extra cycle */
      [
        null,
        null,
        "----"
      ],
/* SP+0 write */
      [
        24218,
        null,
        "----"
      ],
      [
        24218,
        39,
        "-wm-"
      ],
      [
        24218,
        null,
        "----"
      ],
/* SP+1 write */
      [
        24219,
        null,
        "----"
      ],
      [
        24219,
        178,
        "-wm-"
      ],
      [
        24219,
        null,
        "----"
      ],

RLD/RRD

In floooh's analysis there's a full 4 extra clock cycles between the memory read and write cycles:

┌─────┬────┬──────┬──────┬────┬────┬──────┬────┬──────┬──────┬──────┬──────────┐
│  T  │ M1 │ MREQ │ RFSH │ RD │ WR │ AB   │ DB │ PC   │ AF   │ HL   │ Flags    │
├─────┼────┼──────┼──────┼────┼────┼──────┼────┼──────┼──────┼──────┼──────────┤
│ 9/0 │    │      │      │    │    │ 1000 │ 6F │ 0009 │ 5655 │ 1000 │ sZyHxVnC │ <== memory read
│ 9/1 │    │ MREQ │      │ RD │    │ 1000 │ 6F │ 0009 │ 5655 │ 1000 │ sZyHxVnC │
...
│11/1 │    │      │      │    │    │ 1000 │ 34 │ 0009 │ 5655 │ 1000 │ sZyHxVnC │
│12/0 │    │      │      │    │    │ 1000 │ 34 │ 0009 │ 5655 │ 1000 │ sZyHxVnC │ <== 4 extra clock cycles
...
│15/1 │    │      │      │    │    │ 1000 │ 34 │ 0009 │ 5655 │ 1000 │ sZyHxVnC │
│16/0 │    │      │      │    │    │ 1000 │ 34 │ 0009 │ 5655 │ 1000 │ sZyHxVnC │ <== memory write
│16/1 │    │ MREQ │      │    │    │ 1000 │ 46 │ 0009 │ 5655 │ 1000 │ sZyHxVnC │
...
│18/1 │    │      │      │    │    │ 1000 │ 46 │ 0009 │ 5655 │ 1000 │ sZyHxVnC │

However, in the generated tests, the memory write is preceded by only 1 extra clock cycle, with the remaining 3 occurring after it:

/* memory read */
      [
        42625,
        null,
        "----"
      ],
      [
        42625,
        null,
        "r-m-"
      ],
      [
        42625,
        131,
        "----"
      ],
/* 1 extra cycle */
      [
        null,
        null,
        "----"
      ],
/* memory write */
      [
        42625,
        null,
        "----"
      ],
      [
        42625,
        63,
        "-wm-"
      ],
      [
        42625,
        null,
        "----"
      ],
/* 3 remaining extra cycles */
      [
        null,
        null,
        "----"
      ],
      [
        null,
        null,
        "----"
      ],
      [
        null,
        null,
        "----"
      ]
redcode commented 1 week ago

Thanks for notifying this. I'm fixing the test generator. I may port it to C and use my Z80 core to generate the tests. The cycles are wrong in several aspects, such as the ones you mentioned.

Corrections to these tests will be made soon.

BTW, we do not currently check "cycles" in the validation workflow.