Open ghost opened 2 years ago
Donkey Kong Country might also be a good test for this. When I make this change, the green-wireframe Rare intro animates correctly, though it displays garbage on the left side of the screen and other screens get corrupted. It also becomes very sluggish at intervals... maybe that's the overhead of calling DoDma() every scanline.
Also worth noting, in Pokemon Crystal the screen doesn't corrupt in the intro with Prof. Oak, but it does when you open a textbox in the overworld.
Basic idea for a new HDMA system:
io.s: FF55_W: HDMA call comes in. Set _dma_blocks_remaining and _dma_blocks_total (replace _doing_hdma).
lcd.s: entermode0: Decrement _dma_blocks_remaining. If _dma_blocks_remaining == 0, DoDma(_dma_blocks_total << 4).
Back in io.s: FF55_W: Cancel HDMA transfer. Call DoDma((_dma_blocks_total - _dma_blocks_remaining) << 4).
Would be lighter on DoDma() calls while responding to games like Crystal canceling transfers.
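To make the proposal concrete, here is a rough C model of the three steps. It is only a sketch: the `Hdma` struct, the function names and the counting `do_dma()` stand-in are made up for illustration and are not Goomba code. The only hardware fact used is that the low 7 bits of an FF55 write hold the block count minus one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state; field names follow the pseudocode in this thread. */
typedef struct {
    uint8_t  blocks_total;      /* blocks requested by the FF55 write */
    uint8_t  blocks_remaining;  /* blocks still pending */
    uint32_t bytes_copied;      /* stand-in for the work DoDma() would do */
    uint32_t dodma_calls;       /* number of DoDma() invocations */
} Hdma;

/* Stand-in for DoDma(byteCount): it only counts, it copies nothing. */
static void do_dma(Hdma *h, uint32_t byte_count)
{
    h->bytes_copied += byte_count;
    h->dodma_calls++;
}

/* FF55_W with bit 7 set: start an HBlank transfer of (value & 0x7F) + 1 blocks. */
static void hdma_start(Hdma *h, uint8_t value)
{
    h->blocks_total = (uint8_t)((value & 0x7F) + 1);
    h->blocks_remaining = h->blocks_total;
}

/* entermode0: count one block down and flush the whole transfer on the last one. */
static void hdma_hblank(Hdma *h)
{
    if (h->blocks_remaining == 0)
        return;
    if (--h->blocks_remaining == 0)
        do_dma(h, (uint32_t)h->blocks_total << 4);
}

/* FF55_W cancel: flush only the blocks that have elapsed so far. */
static void hdma_cancel(Hdma *h)
{
    assert(h->blocks_remaining <= h->blocks_total);
    uint32_t done = (uint32_t)(h->blocks_total - h->blocks_remaining) << 4;
    if (done != 0)
        do_dma(h, done);
    h->blocks_total = 0;
    h->blocks_remaining = 0;
}
```

With this shape a 4-block transfer costs one DoDma() call instead of four, and a cancel after one HBlank still copies the 16 bytes the game expects to have arrived. The model leaves out source and destination pointer bookkeeping.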
EDIT: Saw new commit. Rebasing.
Also interesting is that "entermode0" only seems to run 1-2 times per frame total (checked through the no$gba debugger). Had to modify timeout.s:
next:
default_scanlinehook:
checkScanlineIRQ:
default_scanlinehook_nohblank:
mov r0,#16 @ One HDMA block = 16 bytes
ldr r1,=_doing_hdma
ldr r1,[r1]
cmp r1,#0xFF
stmfd sp!,{r3,r8-r12,lr} @ r14 is lr, so it only needs to be listed once
blxeq_long DoDma @ Call DoDma if we're doing HDMA
ldmfd sp!,{r3,r8-r12,lr}
_checkScanlineIRQ:
tst cycles,#CYC_LCD_ENABLED
Speed-wise, I suppose hdma creates too many small gba transfers and overloads the time window, creating lag? Hdma is more annoying than I expected.
Rare wireframe intro is drawn using plain dma transfers during vblank; uses hdma to clear the screen before it. So I'm really curious why it was broken before.
Beginning to wonder what "entermode0" really does.
I think DKC soft locks because hdma is taking too long to finish so there must be something more to it than I know. Have to check the LYC counter.
DKC Rare logo garbage: the GBC tilemap looks okay, so I think those 4 right-side tiles are not being marked dirty and updated on the GBA tilemap side (0000 instead of 0058).
FF55_R: @HDMA5
ldrb_ r0,dma_blocks_remaining
ldrb_ r1,doing_hdma
cmp r1,#0xFF
subeq r0,r0,#1 @ If hdma, subtract 1
mov pc,lr
DKC responds better to this fix.
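For anyone following along, this is what the FF55_R snippet computes, written as C. It is an illustrative sketch only: `hdma5_read` and its parameters are hypothetical names, and it assumes the handler's result is truncated to a byte. On hardware, HDMA5 reads back the remaining length minus one while an HBlank transfer is active, which is what the extra subtraction is after.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C equivalent of the FF55_R handler above. */
static uint8_t hdma5_read(uint8_t blocks_remaining, uint8_t doing_hdma)
{
    uint8_t value = blocks_remaining;      /* ldrb_ r0,dma_blocks_remaining */
    if (doing_hdma == 0xFF)                /* cmp r1,#0xFF */
        value = (uint8_t)(value - 1);      /* subeq r0,r0,#1 */
    return value;
}

/* Usage examples for the two paths through the handler. */
static void hdma5_read_examples(void)
{
    assert(hdma5_read(4, 0xFF) == 3);  /* HDMA active: remaining - 1 */
    assert(hdma5_read(4, 0x00) == 4);  /* not active: value passed through */
}
```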
Shantae has a problem with hdma per line via default_scanlinehook. But DKC likes it. Mmmmmm....
io.s
@r0 = dest, r1 = src, r2 = byteCount, r3 = dirtyMapBits
global_func copy_map_and_compare
copy_map_and_compare:
stmfd sp!,{r4,r5,r6,r7,r8,r9,r10,r11}
cmc_loop1_left:
mov r12,#0
tst r0,#0x10
bne cmc_loop1_right
ldmia r0!,{r4,r5,r6,r7}
ldmia r1!,{r8,r9,r10,r11}
eors r4,r4,r8
strne r8,[r0,#-16]
orrne r12,r12,#0x01
eors r5,r5,r9
strne r9,[r0,#-12]
orrne r12,r12,#0x02
eors r6,r6,r10
strne r10,[r0,#-8]
orrne r12,r12,#0x04
eors r7,r7,r11
strne r11,[r0,#-4]
orrne r12,r12,#0x08
subs r2,r2,#16
beq cmc_loop1_exit
cmc_loop1_right:
ldmia r0!,{r4,r5,r6,r7}
ldmia r1!,{r8,r9,r10,r11}
eors r4,r4,r8
strne r8,[r0,#-16]
orrne r12,r12,#0x10
eors r5,r5,r9
strne r9,[r0,#-12]
orrne r12,r12,#0x20
eors r6,r6,r10
strne r10,[r0,#-8]
orrne r12,r12,#0x40
eors r7,r7,r11
strne r11,[r0,#-4]
orrne r12,r12,#0x80
cmc_loop1_exit:
ldrb r4,[r3]
orr r12,r12,r4
strb r12,[r3],#1
subs r2,r2,#16
bmi cmc_part2
bne cmc_loop1_left
cmc_part2:
ldmfd sp!,{r4,r5,r6,r7,r8,r9,r10,r11}
bx lr @ Early return; the tail handling below is unreachable as written
adds r2,r2,#16
ldmlefd sp!,{r4,r5,r6,r7,r8,r9,r10,r11}
bxle lr
b_long _cmc_part2_
.pushsection .text
_cmc_part2_:
ble cmc_done
mov r6,#1
cmc_loop2:
ldr r4,[r0],#4
ldr r5,[r1],#4
eors r4,r4,r5
strne r5,[r0,#-4]
orrne r12,r12,r6
mov r6,r6,lsl#1
subs r2,r2,#4
bgt cmc_loop2
ldrb r4,[r3]
orr r12,r12,r4
strb r12,[r3],#1
cmc_done:
ldmfd sp!,{r4,r5,r6,r7,r8,r9,r10,r11}
bx lr
.popsection
Updates dirty tilemaps per 16 bytes (hdma). DKC now looks much better, but not perfect.
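The dirty-bit layout is easier to see in C. The sketch below is a simplified model of the routine, not a translation: it takes a map base plus offset instead of testing bit 4 of the destination address, indexes the dirty array by absolute row, and assumes sizes are multiples of 4. The name `copy_map_and_compare_c` and its signature are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical C model of copy_map_and_compare.
 * map:    destination tilemap copy
 * offset: byte offset into map where the transfer lands
 * src:    freshly transferred bytes
 * dirty:  one byte per 32-byte tilemap row; bit n covers bytes 4n..4n+3
 */
static void copy_map_and_compare_c(uint8_t *map, size_t offset,
                                   const uint8_t *src, size_t byte_count,
                                   uint8_t *dirty)
{
    assert(offset % 4 == 0 && byte_count % 4 == 0);
    for (size_t i = 0; i < byte_count; i += 4) {
        size_t pos = offset + i;
        if (memcmp(map + pos, src + i, 4) != 0) {
            memcpy(map + pos, src + i, 4);  /* store only words that changed */
            dirty[pos / 32] |= (uint8_t)(1u << ((pos % 32) / 4));
        }
    }
}
```

A 16-byte HDMA block landing in the right half of a row (offset 16) can therefore only set bits 0x10 to 0x80 of that row's dirty byte, which matches the masks in cmc_loop1_right.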
Could you take a look at the hdma2 branch and let me know what you think, if I'm on the right track there? DKC works, Crystal's busted, sprites in Shantae are flickery but recognizable.
It looks promising (!) and easier to follow but I'll have to do some debugging to check the internals; Goomba never behaves the way I'd expect it to.
For DKL Color, I think we need to "encodePC" inside _FF70W after the memmap switch (due to D000 bank). Haven't gotten it to work yet.
I understand your hdma 1-shot optimization now. Clever!
cancel_hdma:
stmfd sp!,{r0-r4,lr}
ldrb_ r0,dma_blocks_total
ldrb_ r1,dma_blocks_remaining
sub r0,r0,r1
lsls r0,r0,#4
blxne_long DoDma
ldmfd sp!,{r0-r4,lr}
The shift has to update the flags (lsls) so that blxne_long tests the shifted result; with that, Crystal works.
DKL New Colors - menu crash fix
@----------------------------------------------------------------------------
_FF70W:@ SVBK - CGB Mode Only - WRAM Bank
@----------------------------------------------------------------------------
...
ldr r1,=wram_W
str_ r1,writemem_tbl+52
wram_remap_pc:
ldr_ r1,lastbank
sub gb_pc,gb_pc,r1
stmfd sp!,{r0}
encodePC
ldmfd sp!,{r0}
mov pc,lr
select_gbc_ram:
...
ldr r1,=wram_W_2
str_ r1,writemem_tbl+52
b wram_remap_pc
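Why the re-encode matters, as a small C model. This is a sketch under assumptions: it supposes the emulated PC is kept as a host address equal to a per-page bias plus the GB address, which is how the `sub gb_pc,gb_pc,lastbank` / `encodePC` pair reads. The `Cpu` struct and function names are hypothetical, and plain integers stand in for host pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical emulator state. */
typedef struct {
    uintptr_t memmap[16];  /* per 4 KiB page: host address minus GB page start */
    uintptr_t lastbank;    /* bias that was used when host_pc was encoded */
    uintptr_t host_pc;     /* lastbank + GB program counter */
} Cpu;

/* encodePC: turn a GB address into a host address via the current memmap. */
static void encode_pc(Cpu *c, uint16_t gb_pc)
{
    c->lastbank = c->memmap[gb_pc >> 12];
    c->host_pc = c->lastbank + gb_pc;
}

/* sub gb_pc,gb_pc,lastbank: recover the GB address from the host address. */
static uint16_t decode_pc(const Cpu *c)
{
    return (uint16_t)(c->host_pc - c->lastbank);
}

/* SVBK-style write: remap page 0xD, then re-encode so PC follows the new bank. */
static void switch_wram_bank(Cpu *c, uintptr_t new_bias)
{
    assert(c != NULL);
    uint16_t gb_pc = decode_pc(c);
    c->memmap[0xD] = new_bias;
    encode_pc(c, gb_pc);
}
```

If the game is executing from D000-DFFF when it writes SVBK, skipping the re-encode leaves the host PC pointing into the old bank, which would fit the menu crash. Code running from any other page is unaffected by the extra step.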
DKC title colors are wrong because of a Rare trick:
Goomba would have to apply the updates on every scanline on the GBA hardware for a proper fix.
Hm. The DKC title colors seem like low priority to me, it's still readable and everything else looks fine. I'll push the Crystal and DKL fixes soon and get a new release out.
Low priority = yes, definitely.
I have an idea for DKC but not familiar with GBA hardware:
Then monitor the GBA scanlines as it renders the frame and update GBA palettes based on cached palette list.
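A rough C sketch of how such a cached palette list could work. Everything here is an assumption made for illustration: the log format, the names, the arbitrary 1024-entry cap, and the idea that the emulator tags each palette write with the GBC scanline it happened on. The `palette` array stands in for wherever the GBA-side colours live, and the replay function would presumably be driven from a per-scanline interrupt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PALETTE_WRITES 1024  /* arbitrary cap for the sketch */

typedef struct {
    uint8_t  line;   /* GBC scanline on which the game wrote the colour */
    uint8_t  index;  /* background palette entry, 0..31 (8 palettes x 4) */
    uint16_t color;  /* 15-bit colour value */
} PaletteWrite;

typedef struct {
    PaletteWrite writes[MAX_PALETTE_WRITES];
    size_t count;    /* entries recorded this frame, in scanline order */
    size_t cursor;   /* next entry to replay */
} PaletteLog;

/* Start of frame: forget the previous frame's writes. */
static void log_reset(PaletteLog *log)
{
    log->count = 0;
    log->cursor = 0;
}

/* Emulation side: remember a palette write; returns 0 if the log is full. */
static int log_record(PaletteLog *log, uint8_t line, uint8_t index, uint16_t color)
{
    if (log->count == MAX_PALETTE_WRITES)
        return 0;
    assert(index < 32);
    assert(log->count == 0 || log->writes[log->count - 1].line <= line);
    log->writes[log->count++] = (PaletteWrite){ line, index, color };
    return 1;
}

/* Display side: replay every write made on or before the given scanline. */
static void log_apply_line(PaletteLog *log, uint8_t line, uint16_t *palette)
{
    while (log->cursor < log->count && log->writes[log->cursor].line <= line) {
        const PaletteWrite *w = &log->writes[log->cursor++];
        palette[w->index] = w->color;
    }
}
```

The cost is one short loop per displayed line, which is the overhead that makes this low priority.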
I wonder why Shantae bugs out when hacks are disabled, but that's not important; could be some ugly timing race.
Otherwise I guess we can close this ticket and reopen later if needed.
Checked out the latest code branch.
I was misinformed by a document. HDMA always transfers 16 bytes per line (checked in bgb). It just takes "1/2 the time".
I think the branch is backwards.
==>
Shantae sprites are still partly bugged on my end though.
EDIT: I still could be wrong and also investigating.