SamCoVT / TaliForth2

A Subroutine Threaded Code (STC) ANSI-like Forth for the 65c02

Loop optimization (WIP) #60

Closed patricksurry closed 7 months ago

patricksurry commented 8 months ago

This is still a work in progress, but it's in a consistent state with tests passing if you want to take a peek.

So far it reorganizes zero page a little, making space for loop control (mostly unused so far), and refactors leave to get it off the return stack. This simplifies the runtime so we just have simple jmps.

Next task is to route the loop vars through zp to simplify the increment and index words.
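For readers unfamiliar with the idea, one common way to keep leave off the return stack is to resolve its exit address at compile time: each leave compiles a forward jump with a placeholder target, and the end of the loop backpatches those jumps. The Python toy below sketches that general technique only. It is not taken from this PR; `compile_loop`, the tuple "instructions", and the list-index addresses are all hypothetical, and a real implementation would also have to discard the loop control on exit.

```python
def compile_loop(body):
    """Compile a flat DO ... LOOP body into pseudo-instructions.

    Hypothetical model: `body` is a list of word names, and "leave"
    becomes a forward jump patched once the loop's end is known.
    """
    code = [("do",)]
    leave_sites = []                      # indices of jumps awaiting a target
    for word in body:
        if word == "leave":
            leave_sites.append(len(code))
            code.append(["jmp", None])    # placeholder target
        else:
            code.append(("call", word))
    code.append(("loop", 1))              # branch back to the first body slot
    exit_addr = len(code)                 # first address past the loop
    for site in leave_sites:
        code[site][1] = exit_addr         # backpatch each leave jump
    return code


program = compile_loop(["foo", "leave", "bar"])
```

Because every exit is a plain jump to a known address, nothing about leave needs to be stored on the return stack at run time in this model.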

SamCoVT commented 8 months ago

I think you've cleaned out enough zp space you could put I and J in zero page. I gave that some thought, but I don't think it's worth it - you'd have to shuffle J to the return stack and then I to J for each new loop (and then back at the end).
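The shuffling cost mentioned here can be pictured with a small Python toy model. It assumes two fixed slots standing in for zero-page I and J, with a list standing in for the return stack; the class name, fields, and move counter are hypothetical, not TaliForth2 code.

```python
class LoopRegs:
    """Toy model: I and J live in fixed slots, outer levels spill to a stack."""

    def __init__(self):
        self.i = None        # innermost loop index (hypothetical zp slot)
        self.j = None        # next-outer loop index (hypothetical zp slot)
        self.rstack = []     # stands in for the return stack
        self.moves = 0       # number of value transfers performed

    def enter(self, start):
        # New loop: J spills to the return stack, then I moves into J.
        self.rstack.append(self.j)
        self.j = self.i
        self.i = start
        self.moves += 2

    def leave(self):
        # Loop exit: J moves back into I, and the old J is pulled back.
        self.i = self.j
        self.j = self.rstack.pop()
        self.moves += 2


regs = LoopRegs()
regs.enter(0)                      # outer loop starts at 0
regs.enter(10)                     # inner loop starts at 10
inner_view = (regs.i, regs.j)      # what I and J read inside the inner loop
regs.leave()
outer_view = (regs.i, regs.j)      # back in the outer loop
regs.leave()
```

In this model every loop entry and exit pays two transfers whether or not J is ever read, which is the overhead being weighed against a faster J.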

It looks like you are all set up to make the change. The cycle tests should let you know if it makes sense to go this route (I think it does) and the regular tests will let you know if you accidentally bork something.

patricksurry commented 8 months ago

OK, this is a working state (still needs a little doc cleanup, etc.). The ROM is slightly shorter, and unnested loops are decently faster because they avoid the push/pull of loop control. i is faster. But nested loops are only a little better. It would probably be a slight win overall to remove the unnested-loop specialization: that would make unnested loops slower but avoid the check on each repeat of a nested one.

Just for fun I'm going to try something a bit different, which I think will be a fair bit quicker and simpler.

patricksurry commented 6 months ago

see #53