jserv / amacc

Small C compiler generating ELF executables for the Arm architecture, supporting JIT execution

Run main() with __libc_start_main() #19

Closed loganchien closed 8 years ago

loganchien commented 8 years ago

In a private conversation, @yodalee mentioned that a compiled executable won't flush its standard output during execution if we redirect the output to a file. This turns out to be a problem with the _start() stub: it calls main() directly, so libc never gets a chance to flush the buffer.

This commit fixes the problem by rewriting the _start() stub for the elf32() function.
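
To illustrate the buffering behaviour behind this report, here is a minimal sketch, not taken from the PR. It assumes a POSIX system and uses a temporary file as a stand-in for redirected stdout; `measure_flush` is a hypothetical helper name. It shows that data written to a fully buffered stream does not reach the file until something flushes it, which is the step that is skipped when main() is called without libc's exit path.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: writes a short string to a fully buffered temporary
   file and reports the on-disk size before and after fflush().
   Returns 0 on success, -1 on any error. */
int measure_flush(long *before, long *after)
{
    struct stat st;
    FILE *f = tmpfile();
    if (!f)
        return -1;

    /* Request full buffering, which stdio typically applies when stdout
       is redirected to a regular file. */
    if (setvbuf(f, NULL, _IOFBF, BUFSIZ) != 0) {
        fclose(f);
        return -1;
    }

    fputs("hello", f);                 /* stays in the stdio buffer */
    if (fstat(fileno(f), &st) != 0) {
        fclose(f);
        return -1;
    }
    *before = (long) st.st_size;

    fflush(f);                         /* what exit() would do for us */
    if (fstat(fileno(f), &st) != 0) {
        fclose(f);
        return -1;
    }
    *after = (long) st.st_size;

    fclose(f);
    return 0;
}
```

Calling `measure_flush(&before, &after)` from a test program should report that no bytes are in the file before the flush and all five bytes are there afterwards.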

jserv commented 8 years ago

Cc. @lecopzer

lecopzer commented 8 years ago

It's a good issue, but I can't run make check on your PR because amacc doesn't support #define.

I also have a question about https://github.com/jserv/amacc/pull/19/commits/3977b857e0b3610bfaf5cb1ca336bb3f3e303aee#diff-2ff88e27cdae923381ce81435b057d84R1098

In glibc-2.23, the file csu/elf-init.c contains:

/* These functions are passed to __libc_start_main by the startup code.
   These get statically linked into each program.  For dynamically linked
   programs, this module will come from libc_nonshared.a and differs from
   the libc.a module in that it doesn't call the preinit array.  */

void
__libc_csu_init (int argc, char **argv, char **envp)
{
  /* For dynamically linked executables the preinit array is executed by
     the dynamic linker (before initializing any shared object).  */
#ifndef LIBC_NONSHARED
  /* For static executables, preinit happens right before init.  */
  {
    const size_t size = __preinit_array_end - __preinit_array_start;
    size_t i;
    for (i = 0; i < size; i++)
      (*__preinit_array_start [i]) (argc, argv, envp);
  }
#endif

#ifndef NO_INITFINI
  _init ();
#endif

  const size_t size = __init_array_end - __init_array_start;
  for (size_t i = 0; i < size; i++)
      (*__init_array_start [i]) (argc, argv, envp);
}

/* This function should not be used anymore.  We run the executable's
   destructor now just like any other.  We cannot remove the function,
   though.  */
void
__libc_csu_fini (void)
{
#ifndef LIBC_NONSHARED
  size_t i = __fini_array_end - __fini_array_start;
  while (i-- > 0)
    (*__fini_array_start [i]) ();

# ifndef NO_INITFINI
  _fini ();
# endif
#endif
}

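From this we can see that the purpose of __libc_csu_init is to execute the functions in the __preinit_array_start array (and then the __init_array_start array), whose bounds are provided by the linker.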
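
The traversal in the quoted glibc code can be sketched in isolation as follows. This is an illustration only: `initarr_demo` is a hypothetical stand-in for the linker-provided array, and the helper names are not from glibc or amacc.

```c
#include <stddef.h>

/* Same shape as the entries glibc calls: (argc, argv, envp). */
typedef void (*initarr_fn)(int, char **, char **);

static int initarr_order;      /* counts calls as they happen */
static int initarr_seen[2];    /* position at which each hook ran */

static void initarr_first(int argc, char **argv, char **envp)
{
    (void) argc; (void) argv; (void) envp;
    initarr_seen[0] = ++initarr_order;
}

static void initarr_second(int argc, char **argv, char **envp)
{
    (void) argc; (void) argv; (void) envp;
    initarr_seen[1] = ++initarr_order;
}

/* Hypothetical stand-in for the array the linker builds from .init_array. */
static initarr_fn initarr_demo[] = { initarr_first, initarr_second };

/* Walks [start, end) in order, as the loop in __libc_csu_init does.
   Returns the number of functions invoked. */
size_t initarr_run(initarr_fn *start, initarr_fn *end,
                   int argc, char **argv, char **envp)
{
    const size_t size = (size_t) (end - start);
    for (size_t i = 0; i < size; i++)
        (*start[i])(argc, argv, envp);
    return size;
}

/* Runs the demo array and returns how many hooks were called. */
size_t initarr_run_demo(void)
{
    return initarr_run(initarr_demo, initarr_demo + 2, 0, NULL, NULL);
}

/* Returns the 1-based position at which hook idx ran (0 if it never ran). */
int initarr_position(int idx)
{
    return initarr_seen[idx];
}
```

The element count comes from subtracting the start pointer from the end pointer, which is why the real code only needs the two boundary symbols.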

In linker

__attribute__ ((section (".preinit_array")))
      typeof (__VLTUnprotectPreinit) *__preinit = __VLTUnprotectPreinit;

VLTUnprotectPreinit -> VLTChangePermission (int perm)

/* Change the permissions on all the pages we have allocated for the
data sets and all the ".vtable_map_var" sections in memory (which
contain our vtable map variables).  PERM indicates whether to make
the permissions read-only or read-write.  */

And when we compile a program with arm-linux-gnueabihf-gcc, the disassembly of the resulting binary shows:

00016d7c <__libc_csu_init>:
   16d7c: e92d 43f8   stmdb sp!, {r3, r4, r5, r6, r7, r8, r9, lr}
   16d80: 4606        mov r6, r0
   16d82: 4d0c        ldr r5, [pc, #48] ; (16db4 <__libc_csu_init+0x38>)
   16d84: 460f        mov r7, r1
   16d86: 4690        mov r8, r2
   16d88: f8df 902c   ldr.w r9, [pc, #44] ; 16db8 <__libc_csu_init+0x3c>
   16d8c: 447d        add r5, pc
   16d8e: f7f9 ebfe   blx 1058c <_init>
   16d92: 44f9        add r9, pc
   16d94: ebc9 0505   rsb r5, r9, r5
   16d98: 10ad        asrs  r5, r5, #2
   16d9a: d009        beq.n 16db0 <__libc_csu_init+0x34>
   16d9c: 2400        movs  r4, #0
   16d9e: f859 3024   ldr.w r3, [r9, r4, lsl #2]
   16da2: 4642        mov r2, r8
   16da4: 3401        adds  r4, #1
   16da6: 4639        mov r1, r7
   16da8: 4630        mov r0, r6
   16daa: 4798        blx r3
   16dac: 42ac        cmp r4, r5
   16dae: d1f6        bne.n 16d9e <__libc_csu_init+0x22>
   16db0: e8bd 83f8   ldmia.w sp!, {r3, r4, r5, r6, r7, r8, r9, pc}
   16db4: 00010a14  .word 0x00010a14
   16db8: 00010a0a  .word 0x00010a0a

00016dbc <__libc_csu_fini>:
   16dbc: 4770        bx  lr
   16dbe: bf00        nop

The code and symbols were added to the source code, but nothing calls them anywhere. It seems the two functions are dropped.

In which cases must we call these two functions?

loganchien commented 8 years ago

@lecopzer: Sorry, I was unaware of the self-bootstrapping test. I have just refined the commits to remove macro usage and some unsupported C features, such as static, const, and sizeof(name).

> The code and symbols were added to the source code, but nothing calls them anywhere. It seems the two functions are dropped.

Their addresses are loaded by the _start() stub from the first and third GOT slots of _start(). They are then passed to __libc_start_main(), which invokes them later.
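
The hand-off described here can be modelled with a small mock. This is a sketch under assumptions, not glibc's implementation: `demo_start_main` is a hypothetical function that takes main plus the two hooks as function pointers and calls them in the order init, main, fini, followed by a placeholder for the stdio flush that a normal exit performs.

```c
#include <string.h>

typedef int (*demo_main_fn)(int, char **, char **);
typedef void (*demo_hook_fn)(void);

static char demo_trace[64];    /* records the order of the calls */

static void demo_log(const char *s)
{
    strncat(demo_trace, s, sizeof demo_trace - strlen(demo_trace) - 1);
}

static void demo_init(void) { demo_log("init;"); }
static void demo_fini(void) { demo_log("fini;"); }

static int demo_main(int argc, char **argv, char **envp)
{
    (void) argv; (void) envp;
    demo_log("main;");
    return argc;
}

/* Mock of the startup routine (assumption: simplified call order only).
   The caller passes the hooks as plain pointers; nothing calls them
   by name, which is why they can look unused in the caller. */
int demo_start_main(demo_main_fn main_fn, int argc, char **argv,
                    demo_hook_fn init, demo_hook_fn fini)
{
    if (init)
        init();
    int status = main_fn(argc, argv, NULL);
    if (fini)
        fini();
    demo_log("flush;");        /* placeholder for flushing stdio at exit */
    return status;
}

/* Runs the mock once and returns the recorded call order. */
const char *demo_run(void)
{
    demo_trace[0] = '\0';
    demo_start_main(demo_main, 1, NULL, demo_init, demo_fini);
    return demo_trace;
}
```

In this model a stub that jumps straight to main() would record only the "main;" step, leaving out both hooks and the final flush.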

> In which cases must we call these two functions?

It is related to C++ global variables with constructors/destructors. The compiler generates initialization/finalization functions for these global variables and places pointers to them in .init_array/.fini_array. __libc_csu_init()/__libc_csu_fini() traverse the arrays and invoke these initializers/finalizers. This also applies to the constructor and destructor function attributes in C.
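
For the C case, a minimal sketch using the GCC/Clang function attributes looks like this. The function names are hypothetical, and the example assumes a toolchain and startup code that honour .init_array/.fini_array.

```c
#include <stdio.h>

static int startup_flag;   /* set by the constructor before main() runs */

/* Runs during startup, before main(), because of the attribute. */
__attribute__((constructor))
static void mark_startup(void)
{
    startup_flag = 1;
}

/* Runs after main() returns or exit() is called. */
__attribute__((destructor))
static void mark_shutdown(void)
{
    fputs("destructor ran\n", stderr);
}

/* Returns 1 if the constructor has already executed. */
int startup_has_run(void)
{
    return startup_flag;
}
```

If the startup code never runs the init hooks, `startup_has_run()` would still return 0 inside main(), which is the failure mode the question is about.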

Yet another use case is to run the code in the .init and .fini sections. But I am not sure whether this is for legacy support or not.

lecopzer commented 8 years ago

Okay, I see that these two functions are referenced from _start(), so we still have to add __libc_csu_init/fini to complete the ELF. Thanks for the reply.

I think your PR can be merged @jserv

jserv commented 8 years ago

Thanks for the great work done by @loganchien !