open-watcom / open-watcom-v2

Open Watcom V2.0 - Source code repository, Wiki, Latest Binary build, Archived builds including all installers for download.
Other
942 stars 157 forks source link

Loop optimization causes invalid results in far pointer arithmetic #1037

Open ttalvitie opened 1 year ago

ttalvitie commented 1 year ago

Summary

When enabling loop optimizations (flag -ol) and performing 16-bit mode far pointer arithmetic in specific circumstances (see code below), the segment value of the resulting far pointer is incorrect.

Environment

2023-01-01 Build of Open Watcom V2; DOS version (installed by choosing defaults in the installer) Running in DOSBox 0.74

Steps to reproduce

Create file code.c:

#include <i86.h>
#include <stdio.h>

int zero = 0;

int main(int argc, char* argv[]) {
    char __far* vgaMem = (char __far*)0xA0000000L;

    int x;
    for(x = zero; x < 1; ++x) {
        char __far* ptr = vgaMem + x;
        printf("(%u; %u) + %d = (%u; %u)\n", FP_SEG(vgaMem), FP_OFF(vgaMem), x, FP_SEG(ptr), FP_OFF(ptr));
    }

    return 0;
}

Run these commands:

autoexec.bat
wcc code.c -d1
wlink system dos name code.exe file code.obj
code.exe
wdis -l=code1.asm -s=code.c code.obj
wcc code.c -d1 -ol
wlink system dos name code.exe file code.obj
code.exe
wdis -l=code2.asm -s=code.c code.obj

Expected result

Both code.exe invocations output (40960; 0) + 0 = (40960; 0). This is because the loop body is executed exactly once with x = 0, and thus ptr should be just equal to vgaMem.

Observed result

The first code.exe invocation outputs (40960; 0) + 0 = (40960; 0) as expected, but the second invocation (after recompiling with -ol) outputs (40960; 0) + 0 = (0; 0), i.e. FP_SEG(ptr) is 0 instead of the expected value 40960.

(There are no compiler or linker warnings.)

Analysis

Consider the disassembly files CODE1.ASM and CODE2.ASM generated by wdis. In CODE1.ASM, all local variables are stored in registers, but in CODE2.ASM (compiled with the -ol flag), main reserves 4 bytes of stack space, which is used in a weird way: the first variable -0x4[bp] is only written to, while the second variable -0x2[bp] is only read from; the read value is used directly as FP_SEG(ptr). Thus it seems like the optimization causes the code that would write 0xA000 to -0x2[bp] to be removed, causing a read of an uninitialized stack value.

If we change x to volatile or move zero to a local variable, the result becomes correct again. If we move zero to a local variable and make it volatile, then we get a different incorrect output (40960; 0) + 0 = (955; 0).

This bug also reproduces on the old Open Watcom v1.9.

jmalak commented 1 year ago

Thank you for your bug report and your analysis. It looks like loop optimization is somehow broken from previous versions. Strange is also use of MUL instruction for constant value 1.

I enclosed compilation result from WATCOM 11.0

.387
                PUBLIC  main_
                PUBLIC  _zero
                EXTRN   printf_:BYTE
                EXTRN   __argc:BYTE
                EXTRN   _small_code_:BYTE
                EXTRN   _cstart_:BYTE
DGROUP          GROUP   CONST,CONST2,_DATA,_BSS
_TEXT           SEGMENT BYTE PUBLIC USE16 'CODE'
                ASSUME CS:_TEXT, DS:DGROUP, SS:DGROUP
main_:
        push            bx
        push            cx
        push            bp
        mov             bp,sp
        sub             sp,4
        mov             bx,word ptr _zero
        mov             ax,bx
        mov             dx,1
        mul             dx
        mov             word ptr [bp-4],ax
        mov             dx,ax
        mov             cx,word ptr [bp-2]
L$1:
        cmp             bx,1
        jge             L$2
        push            dx
        push            cx
        push            bx
        xor             ax,ax
        push            ax
        mov             ax,0a000H
        push            ax
        mov             ax,offset DGROUP:L$3
        push            ax
        call            near ptr printf_
        add             sp,0cH
        inc             dx
        inc             bx
        jmp             L$1
L$2:
        xor             ax,ax
        mov             sp,bp
        pop             bp
        pop             cx
        pop             bx
        ret
_TEXT           ENDS
CONST           SEGMENT WORD PUBLIC USE16 'DATA'
L$3:
    DB  28H, 25H, 75H, 3bH, 20H, 25H, 75H, 29H
    DB  20H, 2bH, 20H, 25H, 64H, 20H, 3dH, 20H
    DB  28H, 25H, 75H, 3bH, 20H, 25H, 75H, 29H
    DB  0aH, 0

CONST           ENDS
CONST2          SEGMENT WORD PUBLIC USE16 'DATA'
CONST2          ENDS
_DATA           SEGMENT WORD PUBLIC USE16 'DATA'
_zero:
    DB  0, 0

_DATA           ENDS
_BSS            SEGMENT WORD PUBLIC USE16 'BSS'
_BSS            ENDS

                END

It looks like very very old bug in WATCOM compiler still not fixed.

javiergutierrezchamorro commented 1 month ago

Any chances of taking care of it @jmalak ? I guess this will improve performance slightly.

javiergutierrezchamorro commented 1 month ago

It continues here: https://github.com/open-watcom/open-watcom-v2/issues/81