JonathonReinhart / staticx

Create static executable from dynamic executable
https://staticx.readthedocs.io/

Generate uncompressed folder with all dependencies #183

Open galdolber opened 3 years ago

galdolber commented 3 years ago

Great project! We were able to deploy a project with it successfully to AWS Lambda.

We are experiencing a delay in startup (I assume due to decompressing the resources).

Is there any way to generate a folder with the static binary and all dependencies, instead of producing a single compressed file?

JonathonReinhart commented 3 years ago

Hi @galdolber!

We are experiencing a delay in startup (I guess for decompressing the resources).

Yes, this is likely a combination of the CPU time spent on decompression, as well as the IO delay associated with writing out the decompressed files to the temporary directory. I would like to add some profiling to the bootloader (#161) to understand this delay for various builds and hardware.

You can try the (undocumented) --no-compress option to omit compression of the archive, and thus avoid the decompression time at startup. I'd be interested to know how you're measuring the startup delay, and whether this makes much of a difference for you.

Is there any way to generate a folder with the static binary and all dependencies instead of resulting it a single compressed file?

Currently, no. This sounds similar to the --onedir option of PyInstaller.


Another idea I had (and need to open an issue for) was to support a "persistent cache" mechanism, where the application would extract its archive to a persistent location on first run and reuse it on subsequent runs instead of re-extracting every time.

This would provide the convenience of shipping a single binary (a huge motivator for staticx), but provide a potentially big performance boost.


Related bootloader issues: #160, #161, #162, #163

galdolber commented 3 years ago

Hi @JonathonReinhart,

Thanks for the response, I'll test with --no-compress and let you know how it goes.

With AWS Lambda it's very easy to see the init time, as it's reported by AWS itself in the logs:

REPORT RequestId: e829655e-bfb8-4eed-aa48-3ae15adcadd0 Duration: 3893.39 ms Billed Duration: 8537 ms Memory Size: 3000 MB Max Memory Used: 701 MB Init Duration: 4643.57 ms

We are getting around 5 seconds of initialization right now.

The persistent cache idea sounds great. It won't work well for Lambdas using the /tmp folder, as that is always empty when the Lambda initializes, but it could work well when using Lambda with EFS.

JonathonReinhart commented 3 years ago

We are getting around 5 seconds of initialization right now.

I wonder how much of that is actually due to staticx startup time. I've never used AWS Lambda, but based on this:

The Init phase ends when the runtime and all extensions signal that they are ready by sending a Next API request.

I can imagine there's potential for a lot of additional startup time from the time staticx invokes your application, until it makes that first Next request.

If you're interested, try out this branch: 161-bootloader-profile and run staticx with --debug. Then you'll see a line like this written to stderr (which I believe Lambda will let you see):

STATICX [12765]: [PROFILE] extract_archive() total:      0.510039318 sec

You can then try that with/without --no-compress.

galdolber commented 3 years ago

Just tried compiling with --no-compress, but sadly the final zipped deployment package is 52 MB, and the limit for Lambda deployment is 50 MB, so I cannot try it out.

Is there any way to exclude some of the libraries that are included in the final binary? I'm compiling an app that uses libvips and some Go code, and I suspect many dependencies are not used at runtime. I'm already doing a custom build of libvips without many of its dependencies, but I'm sure there's more I could exclude.

I'll try to compile 161-bootloader-profile and give it a try to get more information. In the meantime I did the following experiment: running the binary on a normal Linux server (not Lambda) also takes around 5 seconds to start up. But if I copy the extracted folder, stop the original process, and then restore the tmp folder, the binary starts instantly on the next run, since nothing needs to be extracted. (The first GIF is the uncompressed one.)

[Screen recordings attached: 2021-07-26 at 1:13:46 AM and 1:12:26 AM]

brettdh commented 1 year ago

@JonathonReinhart I was also hoping to use staticx with pyinstaller --onedir. Any news to share on that front? How big of a project would it be?

In addition to the increased startup time, I'm also encountering another problem (which I can open a separate issue for if you like) that I was hoping would be solved by using --onedir. It goes something like this:

  1. Build an executable using pyinstaller and staticx, in a python:3.11 Docker container
    • The executable in question uses circus to run a background watcher daemon that manages child processes
    • I'm running the executable from a bash script that sets TMPDIR before execing the bundle built by staticx
  2. Install the executable on a Linux system with an older version of glibc
  3. Start the watcher daemon (don't actually daemonize, so that the output can be seen in the shell)
  4. Occasionally, the child process will start up using the system libc in /lib/libc.so.6, rather than the one in /path/to/staticx/tmpdir/_MEIxxxxxx. This results in errors like this:
    /bin/bash: /lib/libc.so.6: version `GLIBC_2.33' not found (required by /path/to/staticx/tmpdir/_MEIQX3gJl/libtinfo.so.6)

    That libc has a max version of GLIBC_2.24:

    $ strings /lib/libc.so.6  | egrep '^GLIBC_[[:digit:]]' | sort -V | tail -n1
    GLIBC_2.24

I'm guessing that this is some sort of race condition on the unpacking of the staticx archive, but I'm not sure. That's why I was hoping to test with --onedir to see if that works around the problem.

Or, do I need to do something more careful in the wrapper bash script to ensure the right libc gets used?

brettdh commented 1 year ago

Ooh, hmm. #239 might already have some answers for me for the glibc issue. Will try things and report back.

brettdh commented 1 year ago

Update: I'm still stuck.

One detail I realize I left out here: both the daemon process that runs the circus arbiter (process manager) and the child process managed by that arbiter (as a watcher) are the same staticx+pyinstaller bundled application, just with different cmdline args.

If I set LD_LIBRARY_PATH from LD_LIBRARY_PATH_ORIG in the circus watcher config (which spawns my bundled executable as a child process), it silently fails to start for a while, then eventually starts up. If I leave LD_LIBRARY_PATH unmodified in the watcher config, I get the version `GLIBC_2.33' not found error above (which also eventually goes away).

I also tried not using the bash wrapper script with the watcher config (i.e. using the bundled executable directly, since TMPDIR was already set in the env); IIRC this resulted in the silent failure.

brettdh commented 1 year ago

I may have figured it out after running staticx with --debug and having an insight from the output.

As I mentioned, I'm building an executable using staticx and pyinstaller, then using that executable both as a watcher daemon and as the watched child processes. However, I neglected to mention one other use: the command that starts the watcher daemon (using the traditional daemonization method as implemented by circus).

I think this is (now) where things are going wrong:

  1. The "start daemon" command starts up
  2. The bootloaders unpack their tmpdirs
  3. The process forks
  4. The parent exits, cleaning up the tmpdirs
  5. The child continues, but its tmpdirs are already gone, after which... bad things happen?

I don't yet have evidence of the "bad things" in (5), because I'm having difficulty capturing the stdio streams of the child; I just observe that the child exits immediately. Is this a plausible failure scenario? If so, is there a workaround? Or is the whole thing a red herring?

If this is the problem, it makes me wonder why the child is ever able to start up and keep running.

This related pyinstaller issue ended on a not so hopeful note: https://github.com/pyinstaller/pyinstaller/issues/5843

brettdh commented 1 year ago

Confirmed: this is the issue. Adding a time.sleep(10) in the parent after it forks gives the child time to start up and read whatever it needs from the tmpdir before the parent cleans it up. I'm not yet sure whether the child is reading from the staticx tmpdir, the PyInstaller tmpdir, or both, though.

Support for --onedir would neatly avoid all these subtle issues as well.