[Binutils] Generate invalid zero words in function

jiegec commented 5 months ago

Compile and link the following code:

#include <stdio.h>
__attribute__((noinline)) int c() {
        printf("in c\n");
}
__attribute__((noinline)) int d() {
        printf("in d\n");
}
int test(int a, int b) {
        if (a == 0 && b > 0)
                c();
        printf("hello\n");
        d();
}

int main() {
        int a, b;
        scanf("%d%d", &a, &b);
        return test(a, b);
}

Using binutils 2.42 and gcc 13.2.0 from AOSC OS:

gcc -O2 test.c -o test

Generates invalid 0x00000000 instruction in test:

0000000000000980 <test>:
 980:   02ffc063        addi.d          $sp, $sp, -16
 984:   29c02061        st.d            $ra, $sp, 8
 988:   44000c80        bnez            $a0, 12 # 994 <test+0x14>
 98c:   60001c05        bgtz            $a1, 28 # 9a8 <test+0x28>
 990:   00000000        .word           0x00000000
 994:   18000224        pcaddi          $a0, 17
 998:   57fdabff        bl              -600    # 740 <puts@plt>
 99c:   28c02061        ld.d            $ra, $sp, 8
 9a0:   02c04063        addi.d          $sp, $sp, 16
 9a4:   53ffbfff        b               -68     # 960 <d>
 9a8:   57ff9bff        bl              -104    # 940 <c>
 9ac:   18000164        pcaddi          $a0, 11
 9b0:   57fd93ff        bl              -624    # 740 <puts@plt>
 9b4:   28c02061        ld.d            $ra, $sp, 8
 9b8:   02c04063        addi.d          $sp, $sp, 16
 9bc:   53ffa7ff        b               -92     # 960 <d>

Which can lead to SIGILL:

$ ./test
0 0
fish: Job 1, './test' terminated by signal SIGILL (Illegal instruction)

CC @xry111

xry111 commented 5 months ago

I cannot reproduce it. Note that AOSC is applying a bunch of patches onto GCC 13 (and maybe Binutils, I've not checked their Binutils repo).

With vanilla GCC 13.2.0 and -O2:

0000000000000910 <test>:
 910:   02ffc063    addi.d          $sp, $sp, -16
 914:   29c02061    st.d            $ra, $sp, 8
 918:   44000880    bnez            $a0, 8  # 920 <test+0x10>
 91c:   60001c05    bgtz            $a1, 28 # 938 <test+0x28>
 920:   1a000024    pcalau12i       $a0, 1
 924:   02e5c084    addi.d          $a0, $a0, -1680
 928:   57fdabff    bl              -600    # 6d0 <puts@plt>
 92c:   28c02061    ld.d            $ra, $sp, 8
 930:   02c04063    addi.d          $sp, $sp, 16
 934:   53ffd3ff    b               -48 # 904 <d>
 938:   57ffc3ff    bl              -64 # 8f8 <c>
 93c:   1a000024    pcalau12i       $a0, 1
 940:   02e5c084    addi.d          $a0, $a0, -1680
 944:   57fd8fff    bl              -628    # 6d0 <puts@plt>
 948:   28c02061    ld.d            $ra, $sp, 8
 94c:   02c04063    addi.d          $sp, $sp, 16
 950:   53ffb7ff    b               -76 # 904 <d>

With vanilla GCC 13.2.0 and -O2 -mno-explicit-relocs:

 908:   02ffc063    addi.d          $sp, $sp, -16
 90c:   29c02061    st.d            $ra, $sp, 8
 910:   44000880    bnez            $a0, 8  # 918 <test+0x10>
 914:   60001805    bgtz            $a1, 24 # 92c <test+0x24>
 918:   18000244    pcaddi          $a0, 18
 91c:   57fdb7ff    bl              -588    # 6d0 <puts@plt>
 920:   28c02061    ld.d            $ra, $sp, 8
 924:   02c04063    addi.d          $sp, $sp, 16
 928:   53ffdbff    b               -40 # 900 <d>
 92c:   57ffcfff    bl              -52 # 8f8 <c>
 930:   18000184    pcaddi          $a0, 12
 934:   57fd9fff    bl              -612    # 6d0 <puts@plt>
 938:   28c02061    ld.d            $ra, $sp, 8
 93c:   02c04063    addi.d          $sp, $sp, 16
 940:   53ffc3ff    b               -64 # 900 <d>

With GCC 14 (r14-8422 + some patches I've mentioned at https://github.com/loongson-community/areweloongyet/issues/16#issuecomment-1916876990):

 960:   02ffc063    addi.d          $sp, $sp, -16
 964:   29c02061    st.d            $ra, $sp, 8
 968:   44000880    bnez            $a0, 8  # 970 <test+0x10>
 96c:   60002405    bgtz            $a1, 36 # 990 <test+0x30>
 970:   18000284    pcaddi          $a0, 20
 974:   57fd5fff    bl              -676    # 6d0 <puts@plt>
 978:   28c02061    ld.d            $ra, $sp, 8
 97c:   02c04063    addi.d          $sp, $sp, 16
 980:   53ffc3ff    b               -64 # 940 <d>
 984:   03400000    nop
 988:   03400000    nop
 98c:   03400000    nop
 990:   57ff93ff    bl              -112    # 920 <c>
 994:   18000164    pcaddi          $a0, 11
 998:   57fd3bff    bl              -712    # 6d0 <puts@plt>
 99c:   28c02061    ld.d            $ra, $sp, 8
 9a0:   02c04063    addi.d          $sp, $sp, 16
 9a4:   53ff9fff    b               -100    # 940 <d>

Is there an assembly output from AOSC GCC?

xen0n commented 5 months ago

Cannot reproduce with almost-up-to-date snapshot of gcc and binutils-2.42 on Gentoo:

$ gcc --version
gcc (Gentoo 14.0.1_pre20240128 p17) 14.0.1 20240128 (experimental)
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ ld --version
GNU ld (Gentoo 2.42 p1) 2.42.0
Copyright (C) 2024 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.
$ gcc -O2 issue41.c -o issue41
issue41.c: In function ‘main’:
issue41.c:17:9: warning: ignoring return value of ‘scanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
   17 |         scanf("%d%d", &a, &b);
      |         ^~~~~~~~~~~~~~~~~~~~~
$ echo 1 2 | ./issue41 
hello
in d
$ echo $?
5

$ gcc -O2 issue41.c -c -o issue41.o
issue41.c: In function ‘main’:
issue41.c:17:9: warning: ignoring return value of ‘scanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
   17 |         scanf("%d%d", &a, &b);
      |         ^~~~~~~~~~~~~~~~~~~~~
$ objdump -dwr issue41.o

issue41.o:     file format elf64-loongarch

Disassembly of section .text:

0000000000000000 <.Lla-relax-align>:
   0:   03400000    nop 0: R_LARCH_ALIGN    .Lla-relax-align+0x5
   4:   03400000    nop
   8:   03400000    nop
   c:   03400000    nop
  10:   03400000    nop
  14:   03400000    nop
  18:   03400000    nop

000000000000001c <c>:
  1c:   1a000004    pcalau12i       $a0, 0  1c: R_LARCH_PCALA_HI20  .LC0
    1c: R_LARCH_RELAX   *ABS*
  20:   02c00084    addi.d          $a0, $a0, 0 20: R_LARCH_PCALA_LO12  .LC0
    20: R_LARCH_RELAX   *ABS*
  24:   50000000    b               0   # 24 <c+0x8>    24: R_LARCH_B26 puts

0000000000000028 <L0^A>:
  28:   03400000    nop 28: R_LARCH_ALIGN   .Lla-relax-align+0x5
  2c:   03400000    nop
  30:   03400000    nop
  34:   03400000    nop
  38:   03400000    nop
  3c:   03400000    nop
  40:   03400000    nop

0000000000000044 <d>:
  44:   1a000004    pcalau12i       $a0, 0  44: R_LARCH_PCALA_HI20  .LC1
    44: R_LARCH_RELAX   *ABS*
  48:   02c00084    addi.d          $a0, $a0, 0 48: R_LARCH_PCALA_LO12  .LC1
    48: R_LARCH_RELAX   *ABS*
  4c:   50000000    b               0   # 4c <d+0x8>    4c: R_LARCH_B26 puts

0000000000000050 <L0^A>:
  50:   03400000    nop 50: R_LARCH_ALIGN   .Lla-relax-align+0x5
  54:   03400000    nop
  58:   03400000    nop
  5c:   03400000    nop
  60:   03400000    nop
  64:   03400000    nop
  68:   03400000    nop

000000000000006c <test>:
  6c:   02ffc063    addi.d          $sp, $sp, -16
  70:   29c02061    st.d            $ra, $sp, 8

0000000000000074 <L0^A>:
  74:   44001480    bnez            $a0, 20 # 88 <.L5>  74: R_LARCH_B21 .L5
  78:   60003405    bgtz            $a1, 52 # ac <L0^A> 78: R_LARCH_B16 .L7
  7c:   03400000    nop 7c: R_LARCH_ALIGN   .Lla-relax-align+0x4
  80:   03400000    nop
  84:   03400000    nop

0000000000000088 <.L5>:
  88:   1a000004    pcalau12i       $a0, 0  88: R_LARCH_PCALA_HI20  .LC2
    88: R_LARCH_RELAX   *ABS*
  8c:   02c00084    addi.d          $a0, $a0, 0 8c: R_LARCH_PCALA_LO12  .LC2
    8c: R_LARCH_RELAX   *ABS*
  90:   54000000    bl              0   # 90 <.L5+0x8>  90: R_LARCH_B26 puts
  94:   28c02061    ld.d            $ra, $sp, 8

0000000000000098 <L0^A>:
  98:   02c04063    addi.d          $sp, $sp, 16

000000000000009c <L0^A>:
  9c:   50000000    b               0   # 9c <L0^A> 9c: R_LARCH_B26 d
  a0:   03400000    nop a0: R_LARCH_ALIGN   .Lla-relax-align+0x4
  a4:   03400000    nop
  a8:   03400000    nop

00000000000000ac <L0^A>:
  ac:   54000000    bl              0   # ac <L0^A> ac: R_LARCH_B26 c
  b0:   1a000004    pcalau12i       $a0, 0  b0: R_LARCH_PCALA_HI20  .LC2
    b0: R_LARCH_RELAX   *ABS*
  b4:   02c00084    addi.d          $a0, $a0, 0 b4: R_LARCH_PCALA_LO12  .LC2
    b4: R_LARCH_RELAX   *ABS*
  b8:   54000000    bl              0   # b8 <L0^A+0xc> b8: R_LARCH_B26 puts
  bc:   28c02061    ld.d            $ra, $sp, 8

00000000000000c0 <L0^A>:
  c0:   02c04063    addi.d          $sp, $sp, 16
  c4:   50000000    b               0   # c4 <L0^A+0x4> c4: R_LARCH_B26 d

Disassembly of section .text.startup:

0000000000000000 <main-0x1c>:
   0:   03400000    nop 0: R_LARCH_ALIGN    .Lla-relax-align+0x5
   4:   03400000    nop
   8:   03400000    nop
   c:   03400000    nop
  10:   03400000    nop
  14:   03400000    nop
  18:   03400000    nop

000000000000001c <main>:
  1c:   02ff8063    addi.d          $sp, $sp, -32
  20:   29c04077    st.d            $s0, $sp, 16

0000000000000024 <L0^A>:
  24:   1a000017    pcalau12i       $s0, 0  24: R_LARCH_GOT_PC_HI20 __stack_chk_guard
    24: R_LARCH_RELAX   *ABS*
  28:   28c002f7    ld.d            $s0, $s0, 0 28: R_LARCH_GOT_PC_LO12 __stack_chk_guard
    28: R_LARCH_RELAX   *ABS*
  2c:   260002ec    ldptr.d         $t0, $s0, 0
  30:   02c01066    addi.d          $a2, $sp, 4
  34:   00150065    move            $a1, $sp
  38:   1a000004    pcalau12i       $a0, 0  38: R_LARCH_PCALA_HI20  .LC3
    38: R_LARCH_RELAX   *ABS*
  3c:   02c00084    addi.d          $a0, $a0, 0 3c: R_LARCH_PCALA_LO12  .LC3
    3c: R_LARCH_RELAX   *ABS*
  40:   29c0206c    st.d            $t0, $sp, 8
  44:   29c06061    st.d            $ra, $sp, 24

0000000000000048 <L0^A>:
  48:   54000000    bl              0   # 48 <L0^A> 48: R_LARCH_B26 __isoc99_scanf
  4c:   24000465    ldptr.w         $a1, $sp, 4
  50:   24000064    ldptr.w         $a0, $sp, 0
  54:   54000000    bl              0   # 54 <L0^A+0xc> 54: R_LARCH_B26 test
  58:   28c0206d    ld.d            $t1, $sp, 8
  5c:   260002ec    ldptr.d         $t0, $s0, 0
  60:   5c0021ac    bne             $t1, $t0, 32    # 80 <L0^A> 60: R_LARCH_B16 .L11
  64:   28c06061    ld.d            $ra, $sp, 24

0000000000000068 <L0^A>:
  68:   28c04077    ld.d            $s0, $sp, 16
  6c:   02c08063    addi.d          $sp, $sp, 32

0000000000000070 <L0^A>:
  70:   4c000020    ret
  74:   03400000    nop 74: R_LARCH_ALIGN   .Lla-relax-align+0x4
  78:   03400000    nop
  7c:   03400000    nop

0000000000000080 <L0^A>:
  80:   54000000    bl              0   # 80 <L0^A> 80: R_LARCH_B26 __stack_chk_fail

jiegec commented 5 months ago

Assembly and compiled object can be found at test.tar.gz

Assembly of test:

test:
.LFB13 = .
    .cfi_startproc
    addi.d  $r3,$r3,-16
    .cfi_def_cfa_offset 16
    st.d    $r1,$r3,8
    .cfi_offset 1, -8
    bnez    $r4,.L5
    bgt $r5,$r0,.L13
    .align  4,54525952,4
.L5:
    la.local    $r4,.LC2
    bl  %plt(puts)
    ld.d    $r1,$r3,8
    .cfi_remember_state
    .cfi_restore 1
    addi.d  $r3,$r3,16
    .cfi_def_cfa_offset 0
    b   d
    .align  4,54525952,4
.L13:
    .cfi_restore_state
    bl  c
    la.local    $r4,.LC2
    bl  %plt(puts)
    ld.d    $r1,$r3,8
    .cfi_restore 1
    addi.d  $r3,$r3,16
    .cfi_def_cfa_offset 0
    b   d
    .cfi_endproc
.LFE13:
    .size   test, .-test
    .section    .rodata.str1.8
    .align  3
.LC3:
    .ascii  "%d%d\000"

$ objdump -dwr test.o
000000000000006c <test>:
  6c:   02ffc063        addi.d          $sp, $sp, -16
  70:   29c02061        st.d            $ra, $sp, 8

0000000000000074 <L0^A>:
  74:   44000c80        bnez            $a0, 12 # 80 <.L5>      74: R_LARCH_B21 .L5
  78:   60002005        bgtz            $a1, 32 # 98 <L0^A>     78: R_LARCH_B16 .L13
  7c:   00000000        .word           0x00000000

0000000000000080 <.L5>:
  80:   1a000004        pcalau12i       $a0, 0  80: R_LARCH_PCALA_HI20  .LC2
        80: R_LARCH_RELAX       *ABS*
  84:   02c00084        addi.d          $a0, $a0, 0     84: R_LARCH_PCALA_LO12  .LC2
        84: R_LARCH_RELAX       *ABS*
  88:   54000000        bl              0       # 88 <.L5+0x8>  88: R_LARCH_B26 puts
  8c:   28c02061        ld.d            $ra, $sp, 8

0000000000000090 <L0^A>:
  90:   02c04063        addi.d          $sp, $sp, 16

0000000000000094 <L0^A>:
  94:   50000000        b               0       # 94 <L0^A>     94: R_LARCH_B26 d

0000000000000098 <L0^A>:
  98:   54000000        bl              0       # 98 <L0^A>     98: R_LARCH_B26 c
  9c:   1a000004        pcalau12i       $a0, 0  9c: R_LARCH_PCALA_HI20  .LC2
        9c: R_LARCH_RELAX       *ABS*
  a0:   02c00084        addi.d          $a0, $a0, 0     a0: R_LARCH_PCALA_LO12  .LC2
        a0: R_LARCH_RELAX       *ABS*
  a4:   54000000        bl              0       # a4 <L0^A+0xc> a4: R_LARCH_B26 puts
  a8:   28c02061        ld.d            $ra, $sp, 8

00000000000000ac <L0^A>:
  ac:   02c04063        addi.d          $sp, $sp, 16
  b0:   50000000        b               0       # b0 <L0^A+0x4> b0: R_LARCH_B26 d

jiegec commented 5 months ago

AOSC patches for GCC and Binutils.

jiegec commented 5 months ago

The problem still persists using binutils master (commit 44c91f530fc9acc3a535953696d49633e5a7e24a)

jiegec commented 5 months ago

Minimal reproducer:

        .section        .text,"ax",@progbits
        .align  2
        .align  5
        .globl  a
        .type   a, @function
a:
        .align  4,54525952,4

Disassembly of section .text:

0000000000000000 <.Lla-relax-align>:
   0:   03400000        nop             0: R_LARCH_ALIGN        .Lla-relax-align+0x5
   4:   03400000        nop
   8:   03400000        nop
   c:   03400000        nop
  10:   03400000        nop
  14:   03400000        nop
  18:   03400000        nop

000000000000001c <a>:
  1c:   00000000        .word           0x00000000

jiegec commented 5 months ago

I see: in binutils 2.42, for .align 4,54525952,4, binutils only takes the lowest byte (arg = 0, fill_len = 1) from 54525952, which is zero:

      if (arg >= 0)
    fill_len = 1;
      else
    fill_len = -arg;

      if (fill_len <= 1)
    {
      char fill_char = 0;

      fill_char = fill;
      do_align (align, &fill_char, fill_len, max);
    }

However, in binutils 2.41, arg=-4 and fill_len=4. The behavior was changed in https://github.com/bminor/binutils-gdb/commit/27a750dd896cfc13f4368e4c8df14e6ea5bb718f.

The number was written by gcc, see https://github.com/gcc-mirror/gcc/commit/b20c7ee066cb7d952fa193972e8bc6362c6e4063.

jiegec commented 5 months ago

So the answer is: binutils 2.42 assumes gcc with https://github.com/gcc-mirror/gcc/commit/b20c7ee066cb7d952fa193972e8bc6362c6e4063 applied.

xry111 commented 5 months ago

So it now seems making sense to backport b20c7ee066cb7d952fa193972e8bc6362c6e4063 to GCC 12 and 13.

And still Binutils should reject .align 4,54525952,4 if it cannot be supported with -mrelax instead of silently producing wrong code.

xen0n commented 5 months ago

So it now seems making sense to backport b20c7ee066cb7d952fa193972e8bc6362c6e4063 to GCC 12 and 13.

Seems fine to me.

And still Binutils should reject .align 4,54525952,4 if it cannot be supported with -mrelax instead of silently producing wrong code.

Given the form of the now-problematic .align's is very distinguishable, and with unambiguous semantics, can we add detection of such usage to make it behave, while preserving new semantics for other usages? Or is such an approach too hacky in your mind?

xry111 commented 5 months ago

So it now seems making sense to backport b20c7ee066cb7d952fa193972e8bc6362c6e4063 to GCC 12 and 13.

Seems fine to me.

And still Binutils should reject .align 4,54525952,4 if it cannot be supported with -mrelax instead of silently producing wrong code.

Given the form of the now-problematic .align's is very distinguishable, and with unambiguous semantics, can we add detection of such usage to make it behave, while preserving new semantics for other usages? Or is such an approach too hacky in your mind?

We may special case the 2nd operand of .align and handle it properly if it's the NOP encoding.

But why not just support 3-operand .align unconditionally (with any 2nd operand)? I don't think it's always nonsense if the user wants to fill the alignment padding with a different instruction: at least a different "do nothing" instruction like or $r4,$r4,$r0 should be fine.

And for things like -falign-functions and -falign-jumps we can actually fill break or non-exist instructions instead of NOP: the padding added by -falign-functions and -falign-jumps should be unreachable in any normal execution, so a break or non-exist instruction in these cases can actually serve as a cheap hardening (reducing ROP gadgets) and maybe a hint to the uarch for branch prediction.

However doing that will need a revision of ABI. Current (2.30) wording indeed only allows NOP:

Alignment statement. If the symbol index is 0, the addend indicates the number of bytes occupied by nop instructions at the relocation offset. The alignment boundary is specified by the addend rounded up to the next power of two. If the symbol index is not 0, the addend indicates the first and third expressions of .align. The lowest 8 bits are used to represent the first expression, other bits are used to represent the third expression.

xen0n commented 5 months ago

So it now seems making sense to backport b20c7ee066cb7d952fa193972e8bc6362c6e4063 to GCC 12 and 13.

Seems fine to me.

And still Binutils should reject .align 4,54525952,4 if it cannot be supported with -mrelax instead of silently producing wrong code.

Given the form of the now-problematic .align's is very distinguishable, and with unambiguous semantics, can we add detection of such usage to make it behave, while preserving new semantics for other usages? Or is such an approach too hacky in your mind?

We may special case the 2nd operand of .align and handle it properly if it's the NOP encoding.

But why not just support 3-operand .align unconditionally (with any 2nd operand)? I don't think it's always nonsense if the user wants to fill the alignment padding with a different instruction: at least a different "do nothing" instruction like or $r4,$r4,$r0 should be fine.

And for things like -falign-functions and -falign-jumps we can actually fill break or non-exist instructions instead of NOP: the padding added by -falign-functions and -falign-jumps should be unreachable in any normal execution, so a break or non-exist instruction in these cases can actually serve as a cheap hardening (reducing ROP gadgets) and maybe a hint to the uarch for branch prediction.

However doing that will need a revision of ABI. Current (2.30) wording indeed only allows NOP:

Alignment statement. If the symbol index is 0, the addend indicates the number of bytes occupied by nop instructions at the relocation offset. The alignment boundary is specified by the addend rounded up to the next power of two. If the symbol index is not 0, the addend indicates the first and third expressions of .align. The lowest 8 bits are used to represent the first expression, other bits are used to represent the third expression.

Aww. Indeed this would need attention from the Loongson teams if such an improvement is to be made; but in order to additionally stuff a 4-byte second expression into the reloc addend, the expressible range of the third expression must be reduced. I don't think this would create any serious problem though, because I expect large "maximum skip"'s (larger than e.g. 16 bits) to be practically non-existent.

(See GAS doc on .balign and .p2align, the padding is at most 4 bytes long; .align is meant to emulate the host and possibly non-GNU assembler's behavior, and doesn't have its 2nd expr's semantics fully specified, but as LoongArch's canonical toolchain is GNU, we are free to just adopt the GAS standard behavior for our .align too.)

xen0n commented 5 months ago

Given we're little-endian throughout, if we just assume that very large maximum-skip values do not exist, the following change would be backwards-compatible (read: no real-world breakage expected):

-The lowest 8 bits are used to represent the first expression, other bits are used to represent the third expression.
+The lowest 8 bits are used to represent the first expression, the bits 8 to 31 inclusive the third expression, and the higher 32 bits the second expression.
+ If the bitfield for the second expression reads as zero, it is to be treated as a NOP (0x03400000).

This is all assuming ELF64. Elf32_Rela's addend field is only 32 bits wide, there's no way to stuff a max-32-bit value into the field while also having to make the field contain the other expressions as well. Which means we might have to introduce another reloc record like R_LARCH_ALIGN_PAD for expressing the second expression after all, but things start looking really fragile, ~~and I definitely need more sleep or I'll most probably design bad ABI that would make me regret for the rest of my life~~.

xry111 commented 5 months ago

No, op2 is not needed to be encoded in the reloc. The instructions are already in the binary section and R_LARCH_ALIGN just delete some (or all) of them.

We just need to make the assembler stuff min((1 << op1) - 1, op3) / 4 instructions and an R_LARCH_ALIGN, and make the linker delete min((pc + (1 << op1) - 1) & ~((1 << op1) - 1), op3) / 4 instructions once we see an R_LARCH_ALIGN.

xry111 commented 5 months ago

GCC change backporting proposal: https://gcc.gnu.org/pipermail/gcc-patches/2024-February/645067.html

loongson-community / discussions

[Binutils] Generate invalid zero words in function #41