Open rurban opened 8 years ago
With 7d2f258f586d854e7ba4f2500ee189dafd304ef4 I can successfully dump a binary cperl executable from any script, even -e. (darwin so far, the others need to be tested).
Just the initialization order is wrong: gv_fetch fails to retrieve the dynamic $^X, which is still empty.
The emacs undump code for win32 is unusable by perl. emacs uses a custom malloc that allows storing and restoring the custom heap from a disk file. I also can't figure out how the emacs code is going to recreate and make valid again all the FDs from the frozen proc. XS DLLs and 3rd party DLLs need to be frozen and unfrozen too. Win32 unexec code pretty much would have to use https://msdn.microsoft.com/en-us/library/windows/desktop/ms680360%28v=vs.85%29.aspx to make a memory dump file, then reinflate it, and tweak the PEB and TEB structs to register all the Win32 heaps to the master linked list of heaps. There might also be drama in having to defeat ASLR/C stack buffer overflow sentinel patterns.
unexec has its own malloc, yes, to be able to access old dumped memory.
As with perlcc, IO done in BEGIN blocks or before the dump opcode may not be replayed. This is a known limitation, already familiar from perlcc, with open/chdir being the worst. I might think of adding hacks to reopen FDs, which would be easier than with B::C.
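A rough idea of what such an FD-reopening hack could look like, as a standalone C sketch rather than anything in the perl sources. The `saved_fd` struct and the `save_fd`/`restore_fd` helpers are hypothetical names; the sketch assumes the path and open flags are still known at dump time and that the file still exists when the dumped binary starts.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record of one file that was open at dump time. */
typedef struct {
    char  path[256];  /* path originally passed to open() */
    int   flags;      /* open() flags, e.g. O_RDONLY */
    off_t offset;     /* file position at dump time */
    int   fd;         /* descriptor number the program expects */
} saved_fd;

/* Before dumping: remember how to get this descriptor back. */
static int save_fd(saved_fd *s, int fd, const char *path, int flags)
{
    off_t pos = lseek(fd, 0, SEEK_CUR);
    if (pos == (off_t)-1 || strlen(path) >= sizeof s->path)
        return -1;
    strcpy(s->path, path);
    /* Never recreate or truncate the file when restoring. */
    s->flags  = flags & ~(O_CREAT | O_TRUNC | O_EXCL);
    s->offset = pos;
    s->fd     = fd;
    return 0;
}

/* In the dumped binary: reopen the file on the same descriptor number
   and seek back to the saved position. */
static int restore_fd(const saved_fd *s)
{
    int fd = open(s->path, s->flags);
    if (fd < 0)
        return -1;
    if (fd != s->fd) {
        /* Move the new descriptor onto the number the program expects. */
        if (dup2(fd, s->fd) < 0) {
            close(fd);
            return -1;
        }
        close(fd);
    }
    if (lseek(s->fd, s->offset, SEEK_SET) == (off_t)-1)
        return -1;
    return 0;
}
```

This only covers seekable regular files. Pipes, sockets and the effect of an earlier chdir on relative paths would still not be replayed.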
Dynamic modules are correctly loaded with unexec. The corresponding section handles this, e.g. LC_LOAD_DYLIB on darwin. ASLR is also handled correctly by rebasing the dumped sections.
Works for simple scripts, because it's trivial there.
$ ./miniperl -Ilib -u -e'print "ok\n"'
$ ./a.exe
ok
Accessing argv/argc fails on the empty PL_argvgv symbol while dumping. init_argv_symbols/init_postdump_symbols is not run for -u:
/* init_postdump_symbols not currently designed to be called */
/* more than once (ENV isn't cleared first, for example) */
/* But running with -u leaves %ENV & @ARGV undefined! XXX */
if (!PL_do_undump)
init_postdump_symbols(argc,argv,env);
#3 0x00007ffff6d3d966 in malloc_printerr (action=3,
str=0x7ffff6e2c442 "corrupted double-linked list", ptr=<optimized out>, ar_ptr=<optimized out>)
at malloc.c:5007
#4 0x00007ffff6d3e936 in _int_free (av=0x7ffff7064b20 <main_arena>, p=<optimized out>,
have_lock=0) at malloc.c:4006
#5 0x000000000054f011 in Perl_safesysfree (where=0xad0290) at util.c:390
Run init_postdump_symbols twice with -u: we need the %ENV and @ARGV symbols during BEGIN, and we need to re-initialize them in dumped binaries.
/* init_postdump_symbols not currently designed to be called */
/* more than once (ENV isn't cleared first, for example) */
/* But running with -u leaves %ENV & @ARGV undefined! XXX */
init_postdump_symbols(argc,argv,env);
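The comment's worry ("ENV isn't cleared first") is the usual problem with calling an initializer twice. A toy model of the clear-then-fill pattern, in plain C and independent of the perl internals: `env_table`, `env_clear` and `env_init` are made-up names standing in for %ENV and its setup, not perl API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_VARS 64

/* Toy stand-in for %ENV: an array of owned "KEY=VALUE" strings. */
typedef struct {
    char  *vars[MAX_VARS];
    size_t count;
} env_table;

/* Free every entry so the table can be filled again. */
static void env_clear(env_table *t)
{
    for (size_t i = 0; i < t->count; i++)
        free(t->vars[i]);
    t->count = 0;
}

/* Safe to call more than once: drops the old contents first, then
   copies the NULL-terminated env array (at most MAX_VARS entries). */
static int env_init(env_table *t, char *const env[])
{
    env_clear(t);
    for (size_t i = 0; env[i] != NULL && t->count < MAX_VARS; i++) {
        size_t n = strlen(env[i]) + 1;
        char *copy = malloc(n);
        if (copy == NULL)
            return -1;
        memcpy(copy, env[i], n);
        t->vars[t->count++] = copy;
    }
    return 0;
}
```

Called once at dump time and once at startup of the dumped binary, the second call leaves only the runtime environment in the table, with no stale build-time entries.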
sources from emacs, and re-enable -u and dump, with support for elf, coff, darwin, cygwin, win32/64, hpux, aix, sunos/solaris, dos.
TODO:
@dl_modules, and then mmap the rest. The question is how to avoid all the pointer updates for different bases, or whether MAP_FIXED can be used. ASLR is a problem here. See https://stackoverflow.com/questions/6446101/how-do-i-choose-a-fixed-address-for-mmap
See feature/gh176-unexec
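To make the MAP_FIXED question in the TODO concrete, here is a minimal POSIX sketch of the two ways around pointer updates. It is not code from the branch, and `map_near`, `rebase`, `node` and `node_at` are hypothetical names. Without MAP_FIXED the requested address is only a hint, so the caller has to check where the mapping really landed and either rebase saved pointers by the difference or store offsets in the first place.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map len bytes, asking for `want` as the base address.  Without
   MAP_FIXED the address is only a hint and the kernel may pick another
   base; MAP_FIXED would instead replace whatever is already mapped at
   `want`.  Returns NULL on failure. */
static void *map_near(void *want, size_t len)
{
    assert(len > 0);
    void *got = mmap(want, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return got == MAP_FAILED ? NULL : got;
}

/* Option 1: fix up a pointer that was saved while the image lived at
   old_base, now that the image lives at new_base. */
static void *rebase(void *saved, void *old_base, void *new_base)
{
    return (char *)new_base + ((char *)saved - (char *)old_base);
}

/* Option 2: store offsets from the image base instead of pointers, so
   nothing needs fixing up.  Offset 0 means "no next node". */
typedef struct {
    size_t next_off;
    int    value;
} node;

static node *node_at(void *base, size_t off)
{
    return off ? (node *)((char *)base + off) : NULL;
}
```

Option 1 needs a list of every saved pointer in the image, which is exactly the bookkeeping the TODO wants to avoid. Option 2 avoids it but only works for data structures designed around offsets.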