Open EndilWayfare opened 3 years ago
I believe the feature you are requesting is possible today. However, it may not be obvious and PyOxidizer's config file may not support exactly what you want.
The OxidizedFinder meta path finder exposes some APIs to Python that allow it to read and write packed resources data files/blobs. This is described at https://pyoxidizer.readthedocs.io/en/stable/oxidized_importer_freezing_applications.html. The underlying Rust code doing the work is used by both PyOxidizer and the oxidized_importer Python package.
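For anyone following along, here is a rough sketch of what reading and writing a packed resources file from Python might look like. It is based on my reading of the oxidized_importer docs linked above, not a tested recipe: the helper names (scannable_dirs, write_packed_resources, load_packed_resources) are made up for illustration, and the assumption that every scanned resource is acceptable for in-memory loading will not hold for every package.

```python
import os
import sys

try:
    import oxidized_importer  # third-party package, assumed to be installed
except ImportError:
    # Keeps the pure-Python helper below importable without the package.
    oxidized_importer = None


def scannable_dirs(paths):
    """Return only the entries that are existing directories, in order."""
    return [p for p in paths if os.path.isdir(p)]


def write_packed_resources(source_dirs, out_path):
    """Collect Python resources from directories and serialize them to one file."""
    if oxidized_importer is None:
        raise RuntimeError("oxidized_importer is not installed")
    collector = oxidized_importer.OxidizedResourceCollector(
        allowed_locations=["in-memory"]
    )
    for path in scannable_dirs(source_dirs):
        for resource in oxidized_importer.find_resources_in_path(path):
            # Assumption: each resource is allowed to load from memory.
            collector.add_in_memory(resource)
    resources, _file_installs = collector.oxidize()
    finder = oxidized_importer.OxidizedFinder()
    finder.add_resources(resources)
    with open(out_path, "wb") as fh:
        fh.write(finder.serialize_indexed_resources())


def load_packed_resources(path):
    """Index a packed resources file and register its finder for imports."""
    if oxidized_importer is None:
        raise RuntimeError("oxidized_importer is not installed")
    finder = oxidized_importer.OxidizedFinder()
    finder.index_file_memory_mapped(path)
    sys.meta_path.insert(0, finder)
    return finder
```

Something like write_packed_resources(["./site-packages"], "oxidized_resources") followed by load_packed_resources("oxidized_resources") in a fresh interpreter would be the intended usage; both paths are placeholders.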
From PyOxidizer, the PythonEmbeddedResources (https://pyoxidizer.readthedocs.io/en/stable/pyoxidizer_config_type_python_embedded_resources.htm) Starlark type represents a collection of Python resources. If you define a target function that returns an instance of this type, one of the files written will be loadable into OxidizedFinder.
For what it's worth, I have aspirations of formalizing the custom resource serialization format and proposing it as an alternative mechanism for distributing Python packages. I think the value proposition of immutable single-file distributions of Python packages that are more performant than wheels is compelling. Unfortunately, the data format and features within need a lot of work before that can be considered. But the foundation for building your own alternative Python resources distribution format is definitely there in the code today!
I'd like to encourage more experimentation in this space. If there's any APIs or functionality that you'd like to see, please suggest ideas!
Cool! I'll definitely have to dig more into the innards, then! To be honest, this is my first exposure to Starlark/Bazel, and it's a custom dialect so it's not immediately obvious what's core to the language and what's PyOxidizer-specific (kinda like Jenkins Groovy).
Sounds compelling. For as much as pip/PyPI/venv is an improvement in dependency management over, say, C++ for sure, cargo/crates.io beat them by a country mile. I still have a soft spot in my heart for Python (and it's definitely still the best tool for some jobs), so it would be fantastic to see some improvements to its dependency/package/distribution story.
Also, as far as docs go, I personally prefer rustdoc to sphinx/readthedocs. Somehow, I just find everything easier to grok and cross-reference. Maybe it makes less sense the more that a Rust project focuses value on its CLI over its API, plus readthedocs is more familiar to incoming Python folks. Regardless, likely because it's such an inherently complicated concept/process/implementation, some of the docs are a little impenetrable and some stuff only fully clicks (for me at least) after significant experimentation.
That's rather vague. Eh, at some point I'll have to put my money where my mouth is and try to contribute. :)
In a typical Python scientific computing project, the on-disk size is probably dominated by large dependencies like scipy and numpy (88 and 46 MB in my build, respectively), not to mention the standard library. These dependencies likely change at a far slower rate than the local Python and Rust code that makes up the rest of the project. Having to redistribute a 100 MB+ bundle every update is less than ideal.
Granted, those large dependencies are also __file__ users, and must be stored in a "lib" folder anyway. For now, it's possible to leave those alone and distribute the smaller binary (39 MB in my build) instead. However, that binary still contains the stdlib and other packages that can be classified, because I'd like to make use of the fast, zero-copy imports where I can. In future, once the package resources problem is (more) resolved, (more of) those dependencies will make it into the binary to enjoy the same benefits. This is key, I think: the benefit from packing dependencies into the binary is not just "single file" distribution.
It would be really neat if we could package rarely-changing dependencies into a "dependencies.dll" file, or similar, to preserve fast imports while reducing update size. Dynamic linking being what it is, I imagine that this would be very nontrivial to implement cross-platform. Or maybe not; this is outside my area of expertise. Just wanted to raise it as a suggestion!
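To make the suggestion a bit more concrete, here is a minimal sketch of the kind of startup code I have in mind, assuming the rarely-changing dependencies and the frequently-changing application code each live in their own packed resources file. The file names and helper names are hypothetical, and the use of OxidizedFinder with index_file_memory_mapped reflects my understanding of the oxidized_importer package rather than anything PyOxidizer does for you today.

```python
import os
import sys

try:
    import oxidized_importer  # third-party package, assumed to be installed
except ImportError:
    # Keeps the pure-Python helper below importable without the package.
    oxidized_importer = None


def existing_files(paths):
    """Return only the entries that are existing regular files, in order."""
    return [p for p in paths if os.path.isfile(p)]


def register_resource_files(paths):
    """Register one finder per packed resources file, in the order given."""
    if oxidized_importer is None:
        raise RuntimeError("oxidized_importer is not installed")
    finders = []
    for path in existing_files(paths):
        finder = oxidized_importer.OxidizedFinder()
        # Memory mapping is what should keep imports cheap for big files.
        finder.index_file_memory_mapped(path)
        sys.meta_path.append(finder)
        finders.append(finder)
    return finders


# Hypothetical layout: a large, rarely updated file plus a small one
# that is replaced on every release.
RESOURCE_FILES = ["dependencies.resources", "app.resources"]
```

With a layout like this, an update would only need to ship a new app.resources, while dependencies.resources stays untouched until scipy or numpy actually change.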