AppleWin / AppleWin

Apple II emulator for Windows
GNU General Public License v2.0

Timing problems with ASL, INC and DEC (abs,X) #271

Closed Archange427 closed 9 years ago

Archange427 commented 9 years ago

I think I found another timing problem with these instructions:

ASL abs,X (opcode $1E)
DEC abs,X (opcode $DE)
INC abs,X (opcode $FE)

All of them are 7 cycles (not 6) on a 6502. References:
http://archive.6502.org/datasheets/mos_6500_mpu_preliminary_may_1976.pdf
http://archive.6502.org/datasheets/rockwell_r650x_r651x.pdf

Warning: in Sather's "Understanding the Apple //e", tables 4.1/4.2, these 3 instructions seem to be listed as 6 cycles only (though I may be misreading table 4.1).

But according to the datasheets (see above), Eyes/Lichty's book and (especially) observations on a REAL Apple IIe (6502), I think these 3 opcodes actually take 7 cycles.

For the 65C02: according to Eyes/Lichty's book ("Programming the 65816..."), ASL abs,X, DEC abs,X and INC abs,X would be only 6 cycles on a 65C02 ("subtract 1 cycle if 65C02 and no page boundary crossed"). But the 65C02 datasheet doesn't say the same: 7 cycles for all of them (all the time, no page-boundary dependence). Ref for the 65C02: http://archive.6502.org/datasheets/rockwell_r65c00_microprocessors.pdf

Sorry but I cannot test currently with a real Apple IIe 65C02.

If someone could confirm...

Regards Arnaud

tomcw commented 9 years ago

Thanks for bringing this to our attention.

IMO, empirical data should be measured on real 6502 and 65C02 CPUs (instead of relying on contradicting datasheets and books).

If anyone tests on real h/w then please report your findings here.

Archange427 commented 9 years ago

Hi Tom, initially it was a test on a real Apple IIe (6502 version) that revealed the problem. Then I looked further into datasheets and books to verify my observations. So I can confirm (empirically) that ASL, INC and DEC (abs,X) are 7 cycles on a 6502.

Alas, I can't do the same test with 65C02.

tomcw commented 9 years ago

Hi Arnaud, Thanks for confirming. I'll run some tests next time I have my machines out.

ethereal-esthesia commented 9 years ago

I recently ran nearly every Opcode on the 65C02 (real hardware) using the vertical retrace as a timer and got the same unexpected results for 2 of the 3 mentioned.

ASL abs,X (opcode $1E) - 6 cycles, +1 if crossing page boundary
DEC abs,X (opcode $DE) - 7 cycles always
INC abs,X (opcode $FE) - 7 cycles always

I'd be happy to share any other results as well if needed.

sicklittlemonkey commented 9 years ago

INC abs,X and DEC abs,X were not optimized on the 65C02 - always 7 cycles.

Other read-modify-write instructions (e.g. ASL abs,X) should be 6 (+1 for PX) on 65C02; always 7 on NMOS 6502.
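The rule stated above can be written down as a small lookup. This is only a sketch in Python: the function name `rmw_abs_x_cycles` and the CPU labels `"6502"` / `"65C02"` are hypothetical, chosen for illustration, and the table covers only the six standard read-modify-write abs,X opcodes.

```python
# Read-modify-write instructions using absolute,X addressing.
RMW_ABS_X = {
    0x1E: "ASL",
    0x3E: "ROL",
    0x5E: "LSR",
    0x7E: "ROR",
    0xDE: "DEC",
    0xFE: "INC",
}


def rmw_abs_x_cycles(opcode: int, cpu: str, page_crossed: bool = False) -> int:
    """Cycle count for an RMW abs,X opcode, following the rule in this thread."""
    mnemonic = RMW_ABS_X[opcode]
    if cpu == "6502":
        # NMOS 6502: always 7 cycles, page crossing makes no difference.
        return 7
    if cpu == "65C02":
        if mnemonic in ("INC", "DEC"):
            # Not optimized on the 65C02: always 7 cycles.
            return 7
        # Shifts/rotates: 6 cycles, +1 when the indexed address crosses a page.
        return 6 + (1 if page_crossed else 0)
    raise ValueError(f"unknown cpu: {cpu}")
```

For example, `rmw_abs_x_cycles(0x1E, "65C02")` gives 6, while `rmw_abs_x_cycles(0xDE, "65C02")` gives 7, matching the real-hardware measurements reported earlier in the thread.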

Cheers, Nick.


Archange427 commented 9 years ago

cursorcorner: your other results on a real 65C02 interest me! I think they are related to this topic. Can you copy them here, please?

tomcw commented 9 years ago

Since I was looking at #264, I decided to run similar experiments:

300: A2 00 AD 04 C4 DE FF 20 AE 04 C4 00

.ORG $300
 LDX #0
 LDA $C404
 DEC $20FF,X
 LDX $C404
 BRK

NB. Below, delta = (A-X) - 4 cycles for the LDX $C404
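The delta arithmetic can be sketched in Python as follows. This assumes that $C404 reads the low byte of a free-running timer that counts down once per CPU cycle (for instance a 6522 timer on a card in slot 4); the thread does not say so explicitly, and the helper name `measured_cycles` is hypothetical.

```python
def measured_cycles(a: int, x: int, read_overhead: int = 4) -> int:
    """Cycles taken by the instruction under test.

    a: timer byte sampled by the first read (LDA $C404)
    x: timer byte sampled by the second read (LDX $C404)
    read_overhead: cycles of the second read itself (4 for LDX abs)
    """
    # The timer counts down, so the first sample is the larger one.
    # Masking to 8 bits handles the case where the low byte wrapped
    # past $00 between the two samples.
    elapsed = (a - x) & 0xFF
    return elapsed - read_overhead
```

With samples A=$50 and X=$45, the elapsed count is 11, so the instruction under test took 7 cycles once the 4-cycle LDX is subtracted.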

AppleWin v1.25.0.3:

Real h/w:

ethereal-esthesia commented 9 years ago

Archange427, the chart below is based on real hardware tests on an Apple IIe Platinum, 65C02 for all CPU operations other than break. I've been working on an emulator myself for over a decade now, so I have been meaning to put this together for a long while.

[image: 65C02 opcode timing chart measured on an Apple IIe Platinum]

Archange427 commented 9 years ago

cursorcorner: Thank you very much!

Michaelangel007 commented 9 years ago

@cursorcorner Fantastic opcode table summary! I might have to print that off ! Very handy.

tomcw commented 9 years ago

@cursorcorner: Can you explain a few values in your table:

  1. Opcode=0x51: EOR ($FF),Y - Cycles hi=5
    • For ADC/SBC ($FF),Y - Cycles hi=7
    • But for all other IND_Y opcodes, eg. 0x11, Cycles hi=6
    • Shouldn't EOR ($FF),Y have Cycles hi=6 too?
  2. Opcode=0x69, ADC #$FF - Cycles lo=2,hi=3
    • Shouldn't Cycles lo be the same as hi? (ie. 2 for both)
  3. Opcode=0xE9, SBC #$FF - Cycles lo=2,hi=3
    • Same as 2?
  4. ADC opcodes differ in the Cycles lo vs hi times
    • The ones that can page-cross have hi-lo=2, not 1 cycle
      • 0x71: ADC ($FF),Y
      • 0x79: ADC $FFFF,Y
      • 0x7D: ADC $FFFF,X
    • The other 6 ADC variants can't page-cross, but hi-lo=1 not 0 cycles
      • 0x61: ADC ($FF,X)
      • 0x65: ADC $FF
      • 0x69: ADC #$FF
      • 0x6D: ADC $FFFF
      • 0x72: ADC ($FF)
      • 0x75: ADC $FF,X
  5. SBC opcodes have the same timings as per 4 (above).

I just noticed these things as I have been comparing with AppleWin's opcode timings, and these are the 65C02 timings that stand out as different. Thanks.

tomcw commented 9 years ago

I cross-referenced with: http://archive.6502.org/datasheets/rockwell_r65c00_microprocessors.pdf ...and for items 2-5, if the 65C02 is in Decimal mode, then the cycle count is +1, which explains the values you have for ADC & SBC "Cycles hi".

But I think your EOR ($FF),Y cycles hi value is wrong (ie. item 1).
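The decimal-mode explanation for items 2-5 can be checked with a short Python sketch. The base cycle counts are the usual documented 65C02 values for ADC; the function name `adc_cycles_65c02` and its parameters are hypothetical, used only for illustration.

```python
# opcode -> (base cycles, whether the addressing mode can cross a page)
ADC_65C02 = {
    0x61: (6, False),  # ADC ($FF,X)
    0x65: (3, False),  # ADC $FF
    0x69: (2, False),  # ADC #$FF
    0x6D: (4, False),  # ADC $FFFF
    0x71: (5, True),   # ADC ($FF),Y
    0x72: (5, False),  # ADC ($FF)
    0x75: (4, False),  # ADC $FF,X
    0x79: (4, True),   # ADC $FFFF,Y
    0x7D: (4, True),   # ADC $FFFF,X
}


def adc_cycles_65c02(opcode: int, page_crossed: bool = False,
                     decimal: bool = False) -> int:
    """ADC cycle count: base, +1 on page cross, +1 in decimal mode."""
    base, can_cross = ADC_65C02[opcode]
    cycles = base
    if can_cross and page_crossed:
        cycles += 1
    if decimal:
        cycles += 1
    return cycles


def lo_hi(opcode: int) -> tuple[int, int]:
    """Best case (no page cross, binary) and worst case (page cross, decimal)."""
    return adc_cycles_65c02(opcode), adc_cycles_65c02(opcode, True, True)
```

Under these assumptions `lo_hi(0x69)` is `(2, 3)` and `lo_hi(0x71)` is `(5, 7)`, which reproduces the hi-lo=1 and hi-lo=2 pattern from item 4.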

tomcw commented 9 years ago

Fixed in 614dfc1257cc7fc32fb344be27944f145cdd8935. Closing.

ethereal-esthesia commented 9 years ago

tomcw, absolutely: the hi cycles are caused by the decimal (D) flag being set for ADC and SBC. The test was run once with all significant flags set to 0 and then again with all flags set to 1. The times were then taken by counting how many loops could be iterated in a single monitor frame for each opcode. The cycle counts were accurate to several decimal places, but they were transcribed by hand, so thank you for the cross-check!
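A rough Python sketch of how a loops-per-frame count converts to a cycle count follows. It assumes an NTSC Apple II frame of 65 cycles x 262 lines = 17030 CPU cycles and a known fixed loop overhead; the actual test loop and its overhead are not given in this thread, so `loop_overhead_cycles` and the function name are hypothetical.

```python
NTSC_CYCLES_PER_FRAME = 65 * 262  # 17030 CPU cycles per video frame (assumed NTSC)


def cycles_per_opcode(iterations_per_frame: float,
                      loop_overhead_cycles: float,
                      frame_cycles: int = NTSC_CYCLES_PER_FRAME) -> float:
    """Estimate the cycles of the opcode under test.

    iterations_per_frame: loop iterations completed in one frame
                          (may be fractional when averaged over many frames)
    loop_overhead_cycles: cycles per iteration spent outside the opcode
                          under test (counter update, branch, etc.)
    """
    cycles_per_iteration = frame_cycles / iterations_per_frame
    return cycles_per_iteration - loop_overhead_cycles
```

For instance, with a hypothetical 5-cycle loop overhead, about 1419.17 iterations per frame corresponds to 12 cycles per iteration, i.e. a 7-cycle opcode.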

As for EOR, I also found the results to be very strange and felt the need to recheck it a few times myself the first time I looked over the results, and again just now. The only explanation I have for it is that it may have been an oversight in the original architecture, or that it was intended to serve some sort of marketing advantage or a seemingly obscure programming advantage.

On a side note, AppleWin has been an inspiration and I have used it to test against often. Thank you for continuing its legacy.

tomcw commented 4 years ago

Testing EOR (zp),Y on 65C02 (part: 338-6503, YYWW=8637 from my //c and put into my //e)

FE: FF 20
300: A0 0 AE 4 C4 51 FE AD 4 C4 0

.ORG $300
 LDY #0
 LDX $C404
 EOR ($FE),Y
 LDA $C404
 BRK

Also test EOR ($FF),Y:

FF: FF
00: 20
300: A0 0 AE 4 C4 51 FF AD 4 C4 0

tomcw commented 4 years ago

Check ADC ($FE),Y: (op=$71)

Check SBC ($FE),Y: (op=$F1)

Check ORA ($FE),Y: (op=$11)