tkchia / gcc-ia16

Fork of Lambertsen & Jenner (& al.)'s IA-16 (Intel 16-bit x86) port of GNU compilers ― added far pointers & more • use https://github.com/tkchia/build-ia16 to build • Ubuntu binaries at https://launchpad.net/%7Etkchia/+archive/ubuntu/build-ia16/ • DJGPP/MS-DOS binaries at https://gitlab.com/tkchia/build-ia16/-/releases • mirror of https://gitlab.com/tkchia/gcc-ia16
GNU General Public License v2.0

Missed optimisation: add immediate after lea #95

Open ecm-pushbx opened 2 years ago

ecm-pushbx commented 2 years ago

The example source that produces this code is at https://pushbx.org/ecm/test/20211114/test.c

It is taken from https://hg.pushbx.org/ecm/interc3/file/77616b6c4040/INTERPRE.C#l184 (The interpre source is in the Public Domain; the intercep component is under the Fair License.)

When I tried to shrink the example, the register allocation seemed to change, and the code in question was no longer produced. That is why test.c contains the entire interpret_file function.

The C source is compiled with:

    ia16-elf-gcc -Wall -fpack-struct -mcmodel=small -Os test.c -masm=intel -S -o test.s

This is the relevant code:

    swi_info rec;
    swi_info_amis * amis;
    ...
        if (format && (rec.intnum & 0xFF00) != 0) {
            switch (rec.intnum) {
            case 0x100:
                amis = (swi_info_amis *)&rec;
                fprintf(ofp, "Multiplexer replied:"
                    " %04X:%04X -> \"%8.8s\" \"%8.8s\""
                    " version %04X\n",
                    amis->segment, amis->offset,
                    amis->vendor, amis->product,
                    amis->version);

This is the assembly generated from this source:

.L14:
    cmp word ptr [bp+20],   0
    je  .L5
    mov ax, word ptr [bp-402]
    test    ah, -1
    je  .L5
    cmp ax, 256
    jne .L4
    push    word ptr [bp-406]
    lea ax, [-426+bp]
    add ax, 8
    push    ax
    lea ax, [-426+bp]
    push    ax
    push    word ptr [bp-410]
    push    word ptr [bp-408]
    mov ax, offset .LC2
    push    ax
    push    word ptr [bp+12]
    call    fprintf
    add sp, 14
    jmp .L4

This feature request is about the following part:

    lea ax, [-426+bp]
    add ax, 8
    push    ax

The add with an immediate could be folded into the displacement of the lea: lea ax, [-418+bp] computes the same address in a single instruction.
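As a sanity check on the proposed fold, here is a minimal sketch in host C that models both instruction sequences with 16-bit wraparound arithmetic. The function names are hypothetical and exist only for illustration; nothing here comes from gcc-ia16 itself. One caveat the model does not capture: add writes the flags while lea leaves them untouched, so the fold is only valid when nothing reads the flags produced by the add. In the listing above the next instruction is a push, so that seems to hold here.

```c
#include <assert.h>
#include <stdint.h>

/* Models `lea ax, [-426+bp]` followed by `add ax, 8`.
   uint16_t arithmetic wraps modulo 65536, like a 16-bit register. */
uint16_t lea_then_add(uint16_t bp)
{
    uint16_t ax = (uint16_t)(bp - 426u);  /* lea ax, [-426+bp] */
    ax = (uint16_t)(ax + 8u);             /* add ax, 8 */
    return ax;
}

/* Models the folded form `lea ax, [-418+bp]`. */
uint16_t folded_lea(uint16_t bp)
{
    return (uint16_t)(bp - 418u);
}

/* Asserts that both sequences agree for every possible value of bp. */
void check_fold(void)
{
    uint32_t bp;
    for (bp = 0; bp <= 0xFFFFu; bp++)
        assert(lea_then_add((uint16_t)bp) == folded_lea((uint16_t)bp));
}
```

Calling check_fold() from a small main on any hosted compiler walks all 65536 values of bp, including the ones where the subtraction wraps around.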

Bonus optimisation:

    test    ah, -1
    je  .L5

I believe test ah, ah would be one byte shorter.
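The size claim can be checked against the opcode map: test ah, -1 uses the F6 /0 ib form (3 bytes), while test ah, ah uses the 84 /r form (2 bytes). The sketch below records both encodings as byte arrays and models the zero flag for each form. It is an illustration in host C, the names are made up for this example, and the byte values are hand-assembled from the instruction set reference rather than taken from assembler output.

```c
#include <assert.h>
#include <stdint.h>

/* `test ah, -1`: opcode F6 /0, ModRM C4 (register AH), immediate FF. */
const uint8_t test_ah_imm[] = { 0xF6, 0xC4, 0xFF };

/* `test ah, ah`: opcode 84 /r, ModRM E4 (reg = AH, r/m = AH). */
const uint8_t test_ah_ah[] = { 0x84, 0xE4 };

/* Zero flag after `test ah, -1`: set when (ah AND 0xFF) is zero. */
int zf_test_imm(uint8_t ah)
{
    return (ah & 0xFF) == 0;
}

/* Zero flag after `test ah, ah`: set when (ah AND ah) is zero. */
int zf_test_self(uint8_t ah)
{
    return (ah & ah) == 0;
}

/* Asserts that both forms set ZF identically for all 256 values of AH. */
void check_test_forms(void)
{
    unsigned v;
    for (v = 0; v < 256; v++)
        assert(zf_test_imm((uint8_t)v) == zf_test_self((uint8_t)v));
}
```

Both forms AND the same value with something that leaves it unchanged, so the remaining result flags match as well; the only difference is the encoded length.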

ecm-pushbx commented 2 years ago

This is the header containing the packed structures used by the source: https://hg.pushbx.org/ecm/interc3/file/77616b6c4040/intercep.h

    #include <stdint.h>

    /* what we record in our memory block about each interrupt */
    typedef struct __attribute__ ((__packed__)) {
        uint16_t bp, di, si, ds, es, dx;
        uint16_t cx, bx, ax;
        uint16_t ip;
        uint16_t cs;
        uint16_t flags;
        uint16_t intnum;
    } swi_info;

    typedef struct __attribute__ ((__packed__)) {
        uint8_t vendor[8];
        uint8_t product[8];
        uint16_t offset;
        uint16_t segment;
        uint16_t version;
        uint16_t reserved;
        uint16_t intnum;
    } swi_info_amis;
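To relate these structures to the frame displacements in the assembly listing, the following sketch uses offsetof to map each field to its offset from bp. It assumes, based on the listing, that rec starts at bp-426; REC_FRAME_OFFSET and frame_disp are hypothetical names introduced for this illustration. It needs a C11 compiler that accepts the packed attribute (GCC or Clang).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct __attribute__ ((__packed__)) {
    uint16_t bp, di, si, ds, es, dx;
    uint16_t cx, bx, ax;
    uint16_t ip;
    uint16_t cs;
    uint16_t flags;
    uint16_t intnum;
} swi_info;

typedef struct __attribute__ ((__packed__)) {
    uint8_t vendor[8];
    uint8_t product[8];
    uint16_t offset;
    uint16_t segment;
    uint16_t version;
    uint16_t reserved;
    uint16_t intnum;
} swi_info_amis;

/* Assumption read off the listing: rec lives at bp-426. */
enum { REC_FRAME_OFFSET = -426 };

/* Both views are 26 bytes and keep intnum at the same offset,
   which is what makes the cast in the C source line up. */
static_assert(sizeof(swi_info) == 26, "swi_info size");
static_assert(sizeof(swi_info_amis) == 26, "swi_info_amis size");
static_assert(offsetof(swi_info, intnum) == offsetof(swi_info_amis, intnum),
              "intnum must overlap in both views");

/* Displacement from bp of a field at the given byte offset within rec. */
int frame_disp(size_t field_offset)
{
    return REC_FRAME_OFFSET + (int)field_offset;
}
```

With these definitions, frame_disp(offsetof(swi_info_amis, product)) evaluates to -418, the displacement a folded lea would use, and the offset, segment, version and intnum fields land at -410, -408, -406 and -402, matching the memory operands in the listing.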