Open javiergutierrezchamorro opened 10 years ago
Thanks for the useful note about the 8086, where this replacement gains one clock for each of these instructions.
Not only are they faster on the 286 and earlier, they are also smaller on all platforms, so they fit better in cache, which usually results in faster code on newer architectures such as the 486 too.
I hope you will be able to extend this optimization to all the ASM code in Open Watcom, to take better advantage of it.
I have created a simple patch with some updates. If the format is OK with you, and you are able to integrate it without issues, I will continue updating it.
It is available at: http://www.javiergutierrezchamorro.com/_temp/upload/ow-patch.diff
Looking forward to hearing from you.
It would be best to create your own fork of OW2.0 on GitHub and make your changes there, then open a pull request against the OW2.0 repository. That way we can work on the changes together through comments, and you can also easily test your changes.
In any case, your changes should not leave the old code commented out. Simply replace the old code with the new code.
Be careful, there is a mistake in the code below (bld/causeway/asm/cw32.asm):

    cw16_GrabPage:
            mov     ah,43h          ;Allocate pages
            mov     bx,1            ;Get 1 page (16K)
            int     67h
            ;cmp    ah,0            ;Was there an error?
            test    ah,ah
            jne     cw16_GPErr      ;Yes, so exit
            mov     ah,45h          ;Release EMS handle
            int     67h
            ;cmp    ah,0            ;Was there an error?
            test    al,al
            jne     cw16_GPErr      ;Yes, so exit
            clc                     ;Mark for no error
            jmp     cw16_GPEnd
    cw16_GPErr:
Right, I see the error: I mixed up ah and al. I will need to create an automated script or something similar, because replacing them manually is quite error-prone.
I do not think these changes can be fully automated, because the xor operation changes the CPU flags. The changes will require some manual review.
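A semi-automated approach along these lines might still help. The following is only a minimal sketch in Python, not a tool used in this repository: the function name `rewrite` and the regular expressions are hypothetical, and it assumes one instruction per line with registers and literal zeros written as in the snippet quoted in this thread. It rewrites `cmp reg,0` to `test reg,reg` using the same register for both operands, which avoids the ah/al slip, and it only lists `mov reg,0` lines for manual review, since `xor` changes the flags.

```python
import re

# Registers recognised by this sketch (8-, 16- and 32-bit general-purpose names).
REG = r"(?:e?[abcd]x|e?[sd]i|e?[sb]p|[abcd][lh])"
# `cmp reg,0` with an optional trailing comment; groups keep the original spacing.
CMP_ZERO = re.compile(rf"^(\s*)cmp(\s+)({REG})\s*,\s*0+h?(\s*(?:;.*)?)$", re.IGNORECASE)
# `mov reg,0` with an optional trailing comment.
MOV_ZERO = re.compile(rf"^\s*mov\s+({REG})\s*,\s*0+h?\s*(?:;.*)?$", re.IGNORECASE)


def rewrite(lines):
    """Return (new_lines, review).

    new_lines: input with every `cmp reg,0` replaced by `test reg,reg`.
    review: 1-based line numbers of `mov reg,0` candidates, left unchanged
            because `xor reg,reg` would modify the flags.
    """
    out, review = [], []
    for number, line in enumerate(lines, start=1):
        m = CMP_ZERO.match(line)
        if m:
            indent, gap, reg, tail = m.groups()
            # Reuse the captured register for both operands.
            out.append(f"{indent}test{gap}{reg},{reg}{tail}")
            continue
        if MOV_ZERO.match(line):
            review.append(number)
        out.append(line)
    return out, review


if __name__ == "__main__":
    sample = [
        "        cmp     ah,0            ;Was there an error?",
        "        mov     bx,0",
        "        mov     bx,1            ;Get 1 page (16K)",
    ]
    new_lines, review = rewrite(sample)
    print("\n".join(new_lines))
    print("review lines:", review)
```

The `mov` candidates are reported rather than rewritten so that a person can check whether a later instruction still depends on flags set before the `mov`.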
The first batch is ready. I have created the pull request. When it is OK, I will continue with more optimizations.
Forget about it. I noticed that some flags are affected, so I will fix those and create a new pull request when it is OK.
It is now fixed. Please let me know when OK, and I will continue with the updates.
Pull request available here: https://github.com/open-watcom/open-watcom-v2/pull/96
Thank you for integrating it. https://github.com/open-watcom/open-watcom-v2/pull/96
In recent changes I noticed some ASM constructs that set registers to zero, or compare them with zero, in the form:

    cmp reg, 0
    mov reg, 0

It is more compact, and usually faster, to use:

    test reg, reg
    xor reg, reg
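The two replacements do not behave alike with respect to flags. The small Python model below is an illustrative sketch only (the function names are hypothetical, and it models just ZF, SF, CF and OF for a 16-bit register). It shows that `cmp reg, 0` and `test reg, reg` yield the same values for those four flags, while `xor reg, reg` overwrites flags that `mov reg, 0` would have left untouched. The auxiliary flag AF is left out: `cmp` sets it and `test` leaves it undefined, which matters only for code that reads AF.

```python
BITS = 16
MASK = (1 << BITS) - 1


def flags_after_cmp_zero(value):
    """Flags from `cmp reg, 0`: computes reg - 0 and discards the result."""
    result = (value & MASK) - 0
    return {
        "ZF": result == 0,
        "SF": bool(result >> (BITS - 1)),
        "CF": False,  # subtracting zero never borrows
        "OF": False,  # subtracting zero never overflows
    }


def flags_after_test_self(value):
    """Flags from `test reg, reg`: computes reg AND reg and discards the result."""
    result = (value & value) & MASK
    return {
        "ZF": result == 0,
        "SF": bool(result >> (BITS - 1)),
        "CF": False,  # logical instructions clear CF
        "OF": False,  # logical instructions clear OF
    }


def zf_seen_by_jump(compared_value, zeroing):
    """ZF seen by a conditional jump in the sequence:
    cmp ax, 0 / <zero bx> / je label
    where bx is zeroed with either "mov" or "xor"."""
    zf = flags_after_cmp_zero(compared_value)["ZF"]
    if zeroing == "xor":
        zf = True  # xor bx, bx produces zero, so it sets ZF
    # mov bx, 0 does not modify any flags, so ZF is unchanged
    return zf


if __name__ == "__main__":
    same = all(flags_after_cmp_zero(v) == flags_after_test_self(v)
               for v in range(1 << BITS))
    print("cmp/test agree on ZF, SF, CF, OF:", same)
    print("ZF at je with mov:", zf_seen_by_jump(5, "mov"))
    print("ZF at je with xor:", zf_seen_by_jump(5, "xor"))
```

In this model the `cmp` to `test` substitution is a drop-in change, whereas each `mov` to `xor` substitution is safe only when no later instruction relies on flags set before it.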