Factored FIND for speed

jkotlinski / durexforth

Modern C64 Forth

Other

230 stars 28 forks source link

Factored FIND for speed #426

Closed burnsauce closed 2 years ago

burnsauce commented 2 years ago

It's an O(1) improvement, but it is faster.

The find-name implementation is big and slow, but find is faster.

burnsauce commented 2 years ago

Note that this is a first-working commit at this point, I'm still testing and improving.

The general idea is that removing the well-factored forth structure from this area of code will improve performance. Each call to FIND used to perform a bunch of repeated operations with the dictionary pointer on the forth data stack. Several words, in turn, would take this value and index into the dictionary, forthily producing our desired XT and FLAG.

Now, W holds the dictionary pointer as the XT and FLAG are gathered. To facilitate this, FIND holds its functionality top-to-bottom.

FIND-NAME is now in forth, and is a terrible implementation.

burnsauce commented 2 years ago

Benchmark at this point is ~0.32 ms/FIND faster.

vice-screen-2022010917374480 vice-screen-2022010917401954

... which if my math is correct is 32 cycles.

jkotlinski commented 2 years ago

Sorry for not reviewing this yet! I should try to have a look at it soon.

jkotlinski commented 2 years ago

OK, I had a quick look at it now.

One thing made me a bit puzzled, and it is, why prefer FIND over FIND-NAME? FIND is a holdover from Forth-83, it is quite odd that it takes a counted string as input. If you look at the proposal for FIND-NAME, it suggests that FIND-NAME is a "natural factors of all words that look up words in the dictionary, such as FIND, ', POSTPONE, the text interpreter, ..." I agree with the proposal and think it makes sense to have FIND-NAME as a (fast!) primitive.

jkotlinski commented 2 years ago

Also... yes, I can see that there is some overhead with the FIND-NAME calls. But what is the performance impact in % ? 32 cycles does not sound like a lot, compared to the time it takes to search the full word list.

burnsauce commented 2 years ago

I agree that it's a marginal improvement that breaks the standard, and am happy if this PR is rejected.

jkotlinski commented 2 years ago

I am closing then, for reasons stated above. Thanks for the contribution anyway! Much appreciated!