skiselev / 8088_bios

BIOS for Intel 8088 based computers
GNU General Public License v3.0

delay_15us implementation using PIT polling #51

Closed 640-KB closed 11 months ago

640-KB commented 11 months ago

Here is an alternate implementation of delay_15us that polls the PIT counter for accurate delays and works on non-AT systems. This may address https://github.com/skiselev/8088_bios/issues/50 and is in response to your comment at https://github.com/skiselev/8088_bios/issues/50#issuecomment-1737737022. :)

Note: This is the (slightly modified) code I've used in GLaBIOS for such delays, so it is pretty well-tested. However, you'll want to give it a good amount of actual testing within your own codebase for sure.
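For readers unfamiliar with the technique, here is a minimal Python sketch of the general idea behind PIT polling: read a free-running 16-bit down counter repeatedly and accumulate the elapsed ticks, handling wraparound with a 16-bit mask. This is not the code from this PR or from GLaBIOS. `FakePit`, `ticks_for_us` and `delay_ticks` are hypothetical names, and the simulated counter stands in for the latch-and-read port sequence that real hardware would need.

```python
PIT_HZ = 1_193_182  # standard PC PIT input clock (14.31818 MHz / 12)


def ticks_for_us(us):
    """Round a delay in microseconds to a whole number of PIT ticks."""
    return round(us * PIT_HZ / 1_000_000)


class FakePit:
    """Hypothetical stand-in for a latched 16-bit counter read.

    Each read returns the current count, then moves the counter down by
    `step` ticks to simulate the time spent between two reads.
    """

    def __init__(self, start=0x0005, step=3):
        self.count = start
        self.step = step

    def read(self):
        value = self.count
        self.count = (self.count - self.step) & 0xFFFF  # wraps below zero
        return value


def delay_ticks(pit, ticks):
    """Poll until at least `ticks` ticks have elapsed; return ticks waited."""
    elapsed = 0
    last = pit.read()
    while elapsed < ticks:
        now = pit.read()
        # The counter counts down, so elapsed = last - now, modulo 2**16.
        elapsed += (last - now) & 0xFFFF
        last = now
    return elapsed


if __name__ == "__main__":
    wanted = ticks_for_us(15)
    print(wanted, delay_ticks(FakePit(), wanted))
```

Because the loop can only exit on a read boundary, the wait always ends at or after the requested tick count, never before it. The granularity of one read-and-compare iteration is what sets the minimum useful delay.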

skiselev commented 11 months ago

Thanks for your contribution again! My only concern with this change is that for small delay values (e.g. the floppy code has a 3 * 15 us delay), the overhead will be larger than the requested delay. But looking at the code, it shouldn't hurt much.

640-KB commented 10 months ago

Yes, that occurred to me too; I had only used units of 1ms (so that any overhead error was insignificant). I'd always wondered exactly what that overhead is and what the minimum useful delay in a callable proc would be. So... I ran some tests in MartyPC, where calls can be measured with clock-cycle and time accuracy, to get the exact overhead.

At 4.77MHz/8088, the CALL/RET overhead (including register PUSH/POP) for the PIT polling version of delay_15us is 97μs. The overhead in the other two implementations is similar at 86μs, because they only do two PUSH/POPs instead of four.
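To put those figures in CPU terms, the following sketch converts microseconds to clock cycles. It assumes the nominal 4.77MHz clock is exactly 14.31818 MHz / 3; `us_to_cycles` is a hypothetical helper, and the inputs are simply the numbers quoted above.

```python
CPU_HZ = 4_772_727  # assumed: 14.31818 MHz / 3, the nominal "4.77 MHz" clock


def us_to_cycles(us, cpu_hz=CPU_HZ):
    """Convert a duration in microseconds to CPU clock cycles."""
    return us * cpu_hz / 1_000_000


if __name__ == "__main__":
    for label, us in [
        ("one 15 us unit", 15),
        ("overhead, other two versions", 86),
        ("overhead, PIT polling version", 97),
    ]:
        print(f"{label}: {us} us is about {us_to_cycles(us):.0f} cycles")
```

One 15μs unit is only about 72 cycles at this clock speed, so a fixed overhead of several hundred cycles necessarily dominates very short requests.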

For the 45μs call, the PIT polling version takes about 180μs to complete, due to having to do three cycles of reading the timer. I couldn't test the AT_DELAY version since MartyPC doesn't support the AT style PPI refresh bit, though I'd estimate the accuracy may be a little bit better for the 45μs loop -- taking around 100-120μs depending on how you hit it. In the case of resetting the floppy controller, to your point, a very small extra delay is fine (better higher than lower).

On longer loops such as 3000H * 15μs, the error drops to 0.63%. At 7.14MHz the error is of course smaller still: 120μs for the 45μs call, and 0.60% for the 3000H loop. At that point, there wouldn't be any meaningful difference between this and the AT_DELAY version, so we should be good!
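For scale, the arithmetic behind these percentages can be sketched as follows. The helper names are hypothetical and the inputs are only the figures quoted in this thread; nothing here was measured independently.

```python
UNIT_US = 15  # one delay unit, per the delay_15us name


def nominal_us(count):
    """Nominal duration of `count` delay units, in microseconds."""
    return count * UNIT_US


def error_us(count, error_pct):
    """Absolute error implied by a relative error on a `count`-unit delay."""
    return nominal_us(count) * error_pct / 100


def overshoot_pct(requested_us, measured_us):
    """Relative overshoot of a measured delay versus the requested one."""
    return (measured_us - requested_us) / requested_us * 100


if __name__ == "__main__":
    print(nominal_us(0x3000))      # nominal length of the 3000H loop in us
    print(error_us(0x3000, 0.63))  # absolute error at 4.77 MHz
    print(error_us(0x3000, 0.60))  # absolute error at 7.14 MHz
    print(overshoot_pct(45, 180))  # short call at 4.77 MHz
```

The 3000H loop is nominally about 184ms, so 0.63% works out to roughly 1.2ms, while the 45μs call measured at about 180μs overshoots by 300%. The short delays are the only place where the overhead is significant in relative terms.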