linkedin / shiv

shiv is a command line utility for building fully self contained Python zipapps as outlined in PEP 441, but with all their dependencies included.
BSD 2-Clause "Simplified" License

use shiv as "transparent" package compressor #109

Open sdementen opened 5 years ago

sdementen commented 5 years ago

Hello,

I have use cases where I want to deploy a "compact" Python distribution in terms of file count and size. I am using the Python embedded distribution + shiv to get down to ~35 files and 48 MB on Windows with pandas + matplotlib, which is what I want. However, the core entry point of the distribution should still be python.exe and not the shiv file (I am using the Python/Power BI interface, which requires a python.exe).

So I end up doing some "dark magic" in sitecustomize.py to bootstrap "extra.pyz" (which contains all dependencies not packaged with the Python embedded distribution: pandas, matplotlib, etc.) by reusing the logic of the _bootstrap.bootstrap() method without its last part (the run or execute_interpreter call). This works.
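A minimal sketch of what such a sitecustomize.py could look like, using only the standard library. This is not the code from this issue and not shiv's own bootstrap; the helper name `expose_archive`, the cache location, and the assumption that the archive keeps its dependencies under a `site-packages/` prefix are all illustrative. Extracting to disk (rather than importing straight from the zip) is what allows compiled extension modules (.dll/.so/.pyd) to load.

```python
# sitecustomize.py (sketch): expose packages stored in a zip archive at startup.
# The archive name, cache location and helper name are hypothetical.
import site
import sys
import zipfile
from pathlib import Path


def expose_archive(archive: Path, cache_root: Path, prefix: str = "site-packages/") -> Path:
    """Extract `prefix` from `archive` into a cache directory once, then add it to sys.path."""
    # Key the cache on size and mtime so a replaced archive gets re-extracted.
    stat = archive.stat()
    target = cache_root / f"{archive.stem}_{stat.st_size}_{stat.st_mtime_ns}"
    site_packages = target / prefix.rstrip("/")
    if not site_packages.is_dir():
        with zipfile.ZipFile(archive) as zf:
            members = [name for name in zf.namelist() if name.startswith(prefix)]
            zf.extractall(target, members=members)
    # addsitedir appends the directory to sys.path and processes any .pth files in it.
    site.addsitedir(str(site_packages))
    return site_packages


# Assumption: extra.pyz sits next to python.exe. Do nothing if it is absent.
_archive = Path(sys.executable).with_name("extra.pyz")
if _archive.is_file():
    expose_archive(_archive, Path.home() / ".extra_pyz_cache")
```

Nothing is executed after the path is prepared, so python.exe stays the entry point. A more robust version would extract to a temporary directory and rename it into place, so that an interrupted extraction is not mistaken for a complete cache.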

Would there be any interest in adding this use case (supporting on-the-fly caching of compressed packages, including .dll/.so files) to shiv's capabilities?

lorencarvalho commented 5 years ago

hi @sdementen!

Let me make sure I understand your use case - it sounds like you effectively want to use the pyz to bundle some third party dependencies with a small embedded python distribution, but you don't want to invoke the pyz itself, rather, you want to invoke the interpreter and are using a sitecustomize.py file to bootstrap the site-packages from your pyz? Very clever!

I'm not sure what supporting this use case would look like, other than refactoring the bootstrap function into two phases (preparing sys.path, and actually invoking the entrypoint/interpreter). Is that what you had in mind?
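To make the two-phase idea concrete, here is a rough sketch of how such a split could be shaped. The function names (`prepare_path`, `resolve_entry_point`, `run`) and their signatures are hypothetical and do not reflect shiv's actual bootstrap code; the point is only that the path setup becomes callable on its own.

```python
import importlib
import sys
from typing import Callable, Optional


def prepare_path(site_packages: str) -> None:
    """Phase 1: make the bundled dependencies importable."""
    if site_packages not in sys.path:
        sys.path.insert(0, site_packages)


def resolve_entry_point(spec: str) -> Callable:
    """Turn a 'package.module:callable' string into the callable it names."""
    module_name, _, attr = spec.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def run(entry_point: Optional[str]) -> None:
    """Phase 2: invoke the entry point, or fall back to an interactive interpreter."""
    if entry_point:
        sys.exit(resolve_entry_point(entry_point)())
    import code
    code.interact()


def bootstrap(site_packages: str, entry_point: Optional[str] = None) -> None:
    """Single-call behaviour, composed from the two phases."""
    prepare_path(site_packages)
    run(entry_point)
```

With this shape, a sitecustomize.py would call only `prepare_path(...)`, while a directly executed pyz would keep calling `bootstrap(...)`.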

sdementen commented 5 years ago

You got it exactly! The benefit is that the distributable part is smaller. I am using OneDrive to sync the distribution from a network drive to local user drives. Syncing 30k files totaling 240 MB versus 30 files totaling 40 MB is very different (the former being impractical in terms of speed and throttling). There may be other use cases besides this one, though.

To support this, it is indeed about splitting the bootstrap into the two parts you mentioned, and also adding the pyz to sys.path (before calling the bootstrap) to be able to import shiv, then removing it after the first part... But that may not be doable from shiv itself. It may require generating a shiv.pth, adding the pyz to PYTHONPATH, or an instruction to add something to the shebang of the pyz (or going with the creation/modification of sitecustomize.py). And then, probably some documentation of the use case and the proper options in the shiv executable (but I guess I am stating the obvious here).
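As an illustration of the .pth option mentioned above, the sketch below writes a path configuration file that puts an archive on sys.path. The file name `extra.pth` and both helper names are hypothetical, and no such option exists in shiv as far as this thread shows. It relies on the standard `site` module behaviour: each line of a .pth file that names an existing path is appended to sys.path.

```python
import site
from pathlib import Path


def write_pth(site_dir: Path, archive: Path, name: str = "extra.pth") -> Path:
    """Write a .pth file whose single line is the absolute path of the archive."""
    pth = site_dir / name
    pth.write_text(f"{archive.resolve()}\n", encoding="utf-8")
    return pth


def activate(site_dir: Path) -> None:
    """Process the .pth files in site_dir, as the interpreter does for site-packages at startup."""
    site.addsitedir(str(site_dir))
```

On its own, this only covers pure-Python modules, because extension modules cannot be imported directly from a zip archive. Lines in a .pth file that start with `import` are executed by `site`, so such a line could in principle trigger the extraction step for packages with .dll/.so files.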

jhermann commented 4 years ago

@sdementen Please take a look at #134 to see if it fits your use case (not sure it does), with less dark magic.