TomHarte / CLK

A latency-hating emulator of: the Acorn Electron and Archimedes, Amstrad CPC, Apple II/II+/IIe and early Macintosh, Atari 2600 and ST, ColecoVision, Enterprise 64/128, Commodore Vic-20 and Amiga, MSX 1/2, Oric 1/Atmos, early PC compatibles, Sega Master System, Sinclair ZX80/81 and ZX Spectrum.

Fix Apple II floating bus more #1196

Closed: ryandesign closed this 11 months ago

ryandesign commented 11 months ago

I'm sorry, I must have made a mistake while refactoring my code before submitting #1182. Although "Have an Apple Split" and "Rainbow" now work in Clock Signal 23.10.29, its Apple II floating bus behavior is still not right. Don Lancaster's "Vaporlock", for example, goes into an infinite loop instead of getting a lock. This is just a placeholder report to remind me to investigate and submit another fix.

ryandesign commented 11 months ago

I updated my floating bus sampling program from #1180 so that it starts at exactly the same point after detecting the start of vbl. There is no longer a 7-cycle jitter to account for when comparing different runs.

Here is the updated program source.

```asm
; SPDX-FileCopyrightText: © 2023 Ryan Carsten Schmidt
; SPDX-License-Identifier: MIT

;save as fbtest.s and assemble and link with:
;cl65 -t apple2 -C apple2-asm.cfg --start-addr 16384 -u __EXEHDR__ -o fbtest fbtest.s

GBASL       = $26       ;graphics base address low byte
GBASH       = $27       ;graphics base address high byte
A1L         = $3C       ;general purpose A1 register low byte
A1H         = $3D       ;general purpose A1 register high byte
A2L         = $3E       ;general purpose A2 register low byte
A2H         = $3F       ;general purpose A2 register high byte
A3L         = $40       ;general purpose A3 register low byte
A3H         = $41       ;general purpose A3 register high byte
A4L         = $42       ;general purpose A4 register low byte
A4H         = $43       ;general purpose A4 register high byte
HGRPAGE     = $E6       ;hires drawing base address high byte

RDVBLBAR    = $C019     ;vertical blanking flag
TXTCLR      = $C050     ;graphics
TXTSET      = $C051     ;text
MIXCLR      = $C052     ;no split
MIXSET      = $C053     ;split
LOWSCR      = $C054     ;page 1
HISCR       = $C055     ;page 2
LORES       = $C056     ;lores
HIRES       = $C057     ;hires

IDBYTE1     = $FBB3     ;machine identification byte 1
IDBYTE2     = $FBC0     ;machine identification byte 2
WAIT        = $FCA8     ;delay (26+27*A+5*A*A)/2 cycles (A>0)
NXTA1       = $FCBA     ;increment A1 routine
IDROUTINE   = $FE1F     ;machine identification routine
RESET       = $FF59     ;reset and go to monitor

ROWS        = 192       ;number of hires rows
HOLESTARTROW= 128       ;first hires row that's followed by holes
BYTESPERROW = 40        ;bytes per row
HOLESPERROW = 8         ;holes per row for rows that have them
PAGE1H      = $20       ;hires page 1 base address high byte

DATA        = $1000     ;generated data start address
DATALEN     = $800      ;generated data length
DATAEND     = DATA+DATALEN-1 ;generated data end address

OPLDAABS    = $AD       ;lda (absolute addressing) opcode
OPSTAABS    = $8D       ;sta (absolute addressing) opcode
OPRTS       = $60       ;rts opcode

.proc main
        bit MIXCLR      ;no split
        bit LOWSCR      ;show page 1
        bit HIRES       ;hires
        bit TXTCLR      ;show graphics
        lda #PAGE1H     ;load hires page 1 base address high byte
        sta HGRPAGE     ;store it so we draw on page 1
        jsr hiresfill   ;fill hires screen
        lda #<DATA      ;load data start address low byte
        sta A1L         ;store in A1L
        lda #>DATA      ;load data start address high byte
        sta A1H         ;store in A1H
        lda #<DATAEND   ;load data end address low byte
        sta A2L         ;store in A2L
        lda #>DATAEND   ;load data end address high byte
        sta A2H         ;store in A2H
        lda #<prog      ;load program start address low byte
        sta A3L         ;store in A3L
        lda #>prog      ;load program start address high byte
        sta A3H         ;store in A3H
        jsr genprog     ;generate program to sample floating bus
        jsr runprog     ;run generated program
        jmp RESET       ;go to monitor
.endproc

;fill the hires screen: rows filled with 0-191; holes with 192-255
;input: HGRPAGE = high byte of the page address
.proc hiresfill
        ldx #ROWS-1     ;load y coordinate into X
@rowloop:
        txa             ;transfer y coordinate to A
        jsr hiresrowaddr ;get memory address in GBASL,GBASH
        txa             ;transfer y coordinate to A
        ldy #BYTESPERROW-1 ;load byte offset into Y
@byteloop:
        sta (GBASL),Y   ;store y coordinate in screen byte
        dey             ;decrement byte offset
        bpl @byteloop   ;loop for each byte
        cpx #HOLESTARTROW ;check if y coordinate is row with holes
        bcc @nextrow    ;no holes on this row
        adc #63         ;update value to store in hole (carry is set, so this adds 64)
        ldy #BYTESPERROW+HOLESPERROW-1 ;load byte offset into Y
@holeloop:
        sta (GBASL),Y   ;store value in screen hole
        dey             ;decrement byte offset
        cpy #BYTESPERROW-1 ;check if byte offset reached the end
        bne @holeloop   ;loop for each hole
@nextrow:
        dex             ;decrement y coordinate
        cpx #$FF        ;check if y coordinate reached the end
        bne @rowloop    ;loop for each row
        rts             ;return
.endproc

;compute the address of the hires row
;based on the first part of HPOSN in the Apple II+ ROM
;input: HGRPAGE = high byte of page address, A = y coordinate
;output: GBASL,GBASH = row address
.proc hiresrowaddr
        sta GBASH       ;save y coordinate in GBASH
        and #%11000000  ;retain high two bits of A
        sta GBASL       ;store A in GBASL
        lsr A           ;shift A right
        lsr A           ;shift A right
        ora GBASL       ;OR A with GBASL
        sta GBASL       ;store A in GBASL
        lda GBASH       ;restore y coordinate from GBASH
        asl A           ;shift A left
        asl A           ;shift A left
        asl A           ;shift A left
        rol GBASH       ;rotate GBASH left
        asl A           ;shift A left
        rol GBASH       ;rotate GBASH left
        asl A           ;shift A left
        ror GBASL       ;rotate GBASL right
        lda GBASH       ;load GBASH into A
        and #%00011111  ;retain low five bits of A
        ora HGRPAGE     ;combine with hires page base address
        sta GBASH       ;store A in GBASH
        rts             ;return
.endproc

;generate a program that samples the floating bus every 8 cycles
;input: A1L,A1H = data start address, A2L,A2H = data end address,
;A3L,A3H = program start address
.proc genprog
        clc             ;clear carry
@loop:
        ldy #0          ;load 0 into offset
        lda #OPLDAABS   ;load lda (absolute) opcode into A
        sta (A3L),Y     ;store in program
        iny             ;increment offset
        lda #<HIRES     ;load hires soft switch address low byte
        sta (A3L),Y     ;store in program
        iny             ;increment offset
        lda #>HIRES     ;load hires soft switch address high byte
        sta (A3L),Y     ;store in program
        iny             ;increment offset
        lda #OPSTAABS   ;load sta (absolute) opcode into A
        sta (A3L),Y     ;store in program
        iny             ;increment offset
        lda A1L         ;load data address low byte
        sta (A3L),Y     ;store in program
        iny             ;increment offset
        lda A1H         ;load data address high byte
        sta (A3L),Y     ;store in program
        lda A3L         ;load A3 low byte
        adc #6          ;increment by the number of bytes stored above
        sta A3L         ;store it back in A3 low byte
        bcc @next       ;if carry is still clear, skip A3H increment
        inc A3H         ;increment A3 high byte
@next:
        jsr NXTA1       ;increment A1
        bcc @loop       ;loop if A1 hasn't reached A2
        ldy #0          ;load 0 into offset
        lda #OPRTS      ;load rts opcode into A
        sta (A3L),Y     ;store in program
        rts             ;return
.endproc

;run the generated program
;if running on a IIe, wait for the vertical blanking interval to begin
.proc runprog
        sec             ;set carry before identification routine
        jsr IDROUTINE   ;run machine identification routine
        bcc prog        ;if carry was cleared it's a IIgs
        lda IDBYTE1     ;load machine ID byte 1
        cmp #6          ;check for IIe or better
        bne prog        ;it's a II, II+, or III in II+ emulation
        lda IDBYTE2     ;load machine ID byte 2
        beq prog        ;it's a IIc or IIc+
@loop1:
        bit RDVBLBAR    ;wait for vertical blanking interval to end
        bpl @loop1      ;
@loop2:
        bit RDVBLBAR    ;wait for vertical blanking interval to begin
        bmi @loop2      ;2 + 1 when taken
;one complete frame is 65*(192+70)=17030 cycles;
;find exact vbl beginning using 17029-cycle loop
        lda $00         ;3
@loop3:
        lda #74         ;2
        jsr WAIT        ;14702
        lda #24         ;2
        jsr WAIT        ;1777
        lda #12         ;2
        jsr WAIT        ;535
        bit RDVBLBAR    ;4
        nop             ;2
        bpl @loop3      ;2 + 1 when taken
;fall through to prog
.endproc

;the generated program
.proc prog              ;self-modifying! genprog overwrites bytes
        rts             ;starting here
.endproc
```
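The cycle counts in the `runprog` comments can be cross-checked with a short Python sketch. It only re-adds the numbers already written in the source (the `WAIT` formula and the per-instruction counts for `@loop3`); the names `wait_cycles`, `LOOP3`, and `loop_cycles` are made up for this illustration. As I read it, the loop is one cycle shorter than a frame so that each pass polls `RDVBLBAR` one cycle earlier relative to the frame.

```python
def wait_cycles(a: int) -> int:
    """Cycles taken by the monitor WAIT call, per the formula in the source comment."""
    return (26 + 27 * a + 5 * a * a) // 2  # the numerator is even for every integer a


CYCLES_PER_LINE = 65
LINES_PER_FRAME = 192 + 70  # visible lines plus vertical blanking lines
FRAME_CYCLES = CYCLES_PER_LINE * LINES_PER_FRAME

# One pass through @loop3 with the final branch taken, using the counts
# annotated in the assembly source.
LOOP3 = [
    ("lda #74", 2), ("jsr WAIT", wait_cycles(74)),
    ("lda #24", 2), ("jsr WAIT", wait_cycles(24)),
    ("lda #12", 2), ("jsr WAIT", wait_cycles(12)),
    ("bit RDVBLBAR", 4), ("nop", 2), ("bpl @loop3 (taken)", 3),
]
loop_cycles = sum(cycles for _, cycles in LOOP3)

if __name__ == "__main__":
    print(FRAME_CYCLES, loop_cycles, FRAME_CYCLES - loop_cycles)  # 17030 17029 1
```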

You can poke it into memory by entering the monitor with:

CALL -151

and then pasting this in:

4000:2C 52 C0 2C 54 C0 2C 57 C0 2C 50
:C0 A9 20 85 E6 20 34 40 A9 00 85 3C
:A9 10 85 3D A9 FF 85 3E A9 17 85 3F
:A9 E2 85 40 A9 40 85 41 20 79 40 20
:AF 40 4C 59 FF A2 BF 8A 20 57 40 8A
:A0 27 91 26 88 10 FB E0 80 90 0B 69
:3F A0 2F 91 26 88 C0 27 D0 F9 CA E0
:FF D0 E0 60 85 27 29 C0 85 26 4A 4A
:05 26 85 26 A5 27 0A 0A 0A 26 27 0A
:26 27 0A 66 26 A5 27 29 1F 05 E6 85
:27 60 18 A0 00 A9 AD 91 40 C8 A9 57
:91 40 C8 A9 C0 91 40 C8 A9 8D 91 40
:C8 A5 3C 91 40 C8 A5 3D 91 40 A5 40
:69 06 85 40 90 02 E6 41 20 BA FC 90
:D2 A0 00 A9 60 91 40 60 38 20 1F FE
:90 2D AD B3 FB C9 06 D0 26 AD C0 FB
:F0 21 2C 19 C0 10 FB 2C 19 C0 30 FB
:A5 00 A9 4A 20 A8 FC A9 18 20 A8 FC
:A9 0C 20 A8 FC 2C 19 C0 EA 10 EB 60

Run it with:

4000G

After running it, the first few rows of data gathered from the first vbl scanlines can be shown with:

1000.103F

Output on my real unenhanced Apple IIe:

1000- 80 80 80 C0 00 00 00 00
1008- 81 81 81 C1 01 01 01 01
1010- 82 82 82 C2 02 02 02 02
1018- 83 83 83 C3 03 03 03 03
1020- 84 84 84 C4 04 04 04 04
1028- 85 85 85 C5 05 05 05 05
1030- 86 86 86 86 C6 06 06 06
1038- 06 87 87 87 C7 07 07 07
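For reading these dumps: `hiresfill` stores each row's y coordinate in that row's 40 bytes and y+64 in the 8 screen holes that follow rows 128 to 191, so every sampled byte says which part of hires memory the video scanner was fetching. A minimal Python sketch of that decoding follows; `decode` and `parse_dump_line` are hypothetical helper names, not part of the test program.

```python
ROWS = 192  # rows 0-191 are filled with their own y coordinate


def decode(value: int) -> str:
    """Map a sampled byte back to the hires location that hiresfill put it in."""
    if value < ROWS:
        return f"row {value}"
    # Holes hold y + 64 for the row y (128-191) that precedes them.
    return f"hole after row {value - 64}"


def parse_dump_line(line: str) -> list:
    """Turn a monitor dump line such as '1000- 80 80 ...' into a list of ints."""
    _, _, data = line.partition("-")
    return [int(byte, 16) for byte in data.split()]


if __name__ == "__main__":
    for value in parse_dump_line("1000- 80 80 80 C0 00 00 00 00"):
        print(f"${value:02X}: {decode(value)}")
```

For the first line of the dump this reports three samples from row 128, one from the hole after row 128, and four from row 0.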

Output from Virtual ][ 11.4:

1000- 80 80 80 C0 00 00 00 00
1008- 81 81 81 C1 01 01 01 01
1010- 82 82 82 C2 02 02 02 02
1018- 83 83 83 C3 03 03 03 03
1020- 84 84 84 C4 04 04 04 04
1028- 85 85 85 C5 05 05 05 05
1030- 86 86 86 C6 06 06 06 06
1038- 87 87 87 87 C7 07 07 07

Virtual ]['s output is close to correct but there's a slight timing difference. (Real IIe showed 4 $86's starting at $1030; Virtual ][ showed 4 $87's at $1038.)
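One way to see why a run of four identical bytes shows up at all: the generated program samples every 8 cycles, but a scanline is 65 cycles, so most lines get 8 samples and every eighth line gets 9. The Python sketch below counts samples per line for a given starting phase. The `phase` parameter (cycles between the first sample and an arbitrary line boundary) is an assumption of this illustration, not something measured from either machine.

```python
CYCLES_PER_LINE = 65
SAMPLE_PERIOD = 8  # lda abs (4 cycles) + sta abs (4 cycles)


def samples_per_line(phase: int, lines: int = 8) -> list:
    """Count how many samples land in each of the first `lines` scanlines."""
    counts = [0] * lines
    t = phase
    while t < lines * CYCLES_PER_LINE:
        counts[t // CYCLES_PER_LINE] += 1
        t += SAMPLE_PERIOD
    return counts


if __name__ == "__main__":
    for phase in (6, 7):
        print(phase, samples_per_line(phase))
```

In this model a one-cycle change in phase moves the 9-sample line down by one, which would be consistent with the extra byte appearing in the $86 row on the real IIe and in the $87 row on Virtual ][. That is only an illustration of the arithmetic, not a diagnosis of Virtual ][.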

Output from OpenEmulator 1.1.1-202203110628:

1000- 80 C0 00 00 00 00 81 81
1008- 81 C1 01 01 01 01 82 82
1010- 82 C2 02 02 02 02 83 83
1018- 83 83 C3 03 03 03 03 84
1020- 84 84 C4 04 04 04 04 85
1028- 85 85 C5 05 05 05 05 86
1030- 86 86 C6 06 06 06 06 87
1038- 87 87 C7 07 07 07 07 88

OpenEmulator's timing is way off (4 $83's at $1016).

Output from Clock Signal 23.09.10:

1000- 84 90 D0 00 00 00 00 00
1008- B8 B8 F8 01 01 01 01 01
1010- B9 B9 F9 02 02 02 02 02
1018- BA BA FA 03 03 03 03 03
1020- BB BB FB 04 04 04 04 04
1028- BC BC FC 05 05 05 05 05
1030- BD BD FD 06 06 06 06 06
1038- BE BE BE FE 07 07 07 07

Timing matches Virtual ][ and the wrong values were mostly fixed by #1182.

Output from Clock Signal 23.10.29:

1000- 80 80 80 D0 00 00 00 00
1008- 81 81 81 F8 01 01 01 01
1010- 82 82 82 F9 02 02 02 02
1018- 83 83 83 FA 03 03 03 03
1020- 84 84 84 FB 04 04 04 04
1028- 85 85 85 FC 05 05 05 05
1030- 86 86 86 FD 06 06 06 06
1038- 87 87 87 87 FE 07 07 07

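Better, except I still have an error in the last eight bytes of hbl during vbl: one sample per scanline is wrong ($D0, then $F8 through $FE, where Virtual ][ reads $C0 through $C7).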
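Since the sample timing in this dump lines up with Virtual ]['s, diffing the two isolates the wrong values. Here is a small Python sketch that does that with the dumps copied from this comment; `parse_dump` and `mismatches` are hypothetical helpers written for this illustration.

```python
def parse_dump(text: str) -> dict:
    """Parse monitor dump lines ('1000- 80 80 ...') into {address: byte}."""
    memory = {}
    for line in text.strip().splitlines():
        addr, _, data = line.partition("-")
        base = int(addr, 16)
        for offset, byte in enumerate(data.split()):
            memory[base + offset] = int(byte, 16)
    return memory


def mismatches(a: dict, b: dict) -> list:
    """List (address, byte in a, byte in b) wherever the two dumps differ."""
    return [(addr, a[addr], b[addr])
            for addr in sorted(a) if addr in b and a[addr] != b[addr]]


VIRTUAL_II = """
1000- 80 80 80 C0 00 00 00 00
1008- 81 81 81 C1 01 01 01 01
1010- 82 82 82 C2 02 02 02 02
1018- 83 83 83 C3 03 03 03 03
1020- 84 84 84 C4 04 04 04 04
1028- 85 85 85 C5 05 05 05 05
1030- 86 86 86 C6 06 06 06 06
1038- 87 87 87 87 C7 07 07 07
"""

CLOCK_SIGNAL_23_10_29 = """
1000- 80 80 80 D0 00 00 00 00
1008- 81 81 81 F8 01 01 01 01
1010- 82 82 82 F9 02 02 02 02
1018- 83 83 83 FA 03 03 03 03
1020- 84 84 84 FB 04 04 04 04
1028- 85 85 85 FC 05 05 05 05
1030- 86 86 86 FD 06 06 06 06
1038- 87 87 87 87 FE 07 07 07
"""

if __name__ == "__main__":
    diffs = mismatches(parse_dump(CLOCK_SIGNAL_23_10_29), parse_dump(VIRTUAL_II))
    for addr, got, want in diffs:
        print(f"${addr:04X}: got ${got:02X}, expected ${want:02X}")
```

This lists eight differing addresses, one per sampled scanline.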

Output from Clock Signal with the bug fix that I will submit shortly is now identical to Virtual ][:

1000- 80 80 80 C0 00 00 00 00
1008- 81 81 81 C1 01 01 01 01
1010- 82 82 82 C2 02 02 02 02
1018- 83 83 83 C3 03 03 03 03
1020- 84 84 84 C4 04 04 04 04
1028- 85 85 85 C5 05 05 05 05
1030- 86 86 86 C6 06 06 06 06
1038- 87 87 87 87 C7 07 07 07