TriggerMail / rules_pyz

Bazel Python rules that package everything in an executable zip
Apache License 2.0

Hermetic python #39

Open Globegitter opened 5 years ago

Globegitter commented 5 years ago

One thing that is quite nice about Bazel is that it provides hermeticity, i.e. builds do not depend on your system tools (at least to some degree). For example, rules_go and rules_nodejs do this: on first invocation they download the SDK / nodejs, and only that is used to build and run your apps. I think this is pretty nice because you do not have to worry about what is installed on your system (or in the docker container); Bazel provides the version needed and everyone is on exactly the same version. Combined with the isolation, that should provide some pretty nice guarantees and a nicer dev experience, especially if it could support different interpreter versions in one project (at the very least py2/py3).

Is that something you have considered before?

joshclimacell commented 5 years ago

pyz_image doesn't satisfy this?

Globegitter commented 5 years ago

Yeah, true, that is one way to get this. But it has the disadvantage of needing docker to run, and part of the appeal of Bazel for us is not needing docker for dev. As I mentioned, it is also nice not to have to worry about setting anything up correctly/coherently, which would still be needed for generating deps and running tests.

And for building the image in the first place, could it make a difference whether python points to py3 or py2? I know in the bazelbuild rules it can make a difference (e.g. for pip deps), but here maybe not? And can a docker container also be built on Mac with these rules?

Either way, I was not necessarily saying that these rules should do that. I was more curious whether some thought had already been put into this, and whether there were any big reasons (apart from possibly time) not to do it.

joshclimacell commented 5 years ago

Thanks for the clarification! Actually shipping Python and dependent C libraries in each pyz_binary sounds like a heavy lift to me (although definitely possible)... The pyz_image rules @evanj put together do force you to pick a Python version -- is that what you're wondering about? And I think they're based on the Bazel docker rules, which don't require a docker install to build images. But yeah you need docker to run them.

det commented 5 years ago

You should be able to get hermetic python if you do something like this:

Package Python with its binaries and standard library into a directory or external repository. This is the hardest part.

Use the pyz rules as follows (assuming you have made an external repo called @python_repo// with a filegroup globbing everything called "everything"):

pyz_binary(
    ...
    data = ["@python_repo//:everything"],
    interpreter_path = "${RUNFILES}/python_repo/usr/bin/python3",
)
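
For the first step, the BUILD file of the external repository might look roughly like the sketch below. It is only an illustration: it assumes the repository root holds an unpacked Python install laid out so that usr/bin/python3 exists (matching the interpreter_path above), and it leaves open how @python_repo is declared in the WORKSPACE.

```python
# BUILD file at the root of the hypothetical @python_repo repository.
# Assumption: the repository contains an unpacked Python install,
# including usr/bin/python3 and its standard library.
filegroup(
    name = "everything",
    # Glob every file so the interpreter, its shared libraries, and the
    # standard library all end up in the binary's runfiles.
    srcs = glob(["**"]),
    visibility = ["//visibility:public"],
)
```

With this in place, "@python_repo//:everything" in the data attribute refers to the filegroup above.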

evanj commented 5 years ago

Summary: I think making some sort of Python toolchain, analogous to the Go rules' toolchains, would be very useful. I haven't even thought about how to make this work though. If someone would like to attempt the Bazel magic needed for this, I would love to help.

I haven't really needed a separate Python install yet. However, with Python3, it would be extremely useful (since Mac OS X does not ship it).

The "main" script used by pyz rules has a bunch of hacks in it to attempt to only load the standard library from the "system" interpreter, and not anything else. This is because we have run into all sorts of weird issues that happen when people have some slightly broken Python install, or are running targets from inside a virtualenv, etc. A hermetic Python install would sidestep much of this.
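
To give a rough idea of what that kind of filtering can look like, here is a minimal sketch. It is not the actual rules_pyz main script; the function name strip_site_dirs and the choice to match on "site-packages" / "dist-packages" directory names are assumptions made for illustration.

```python
import os


def strip_site_dirs(path_entries):
    """Return path_entries without anything under site-packages or dist-packages.

    Hypothetical helper: it only looks at directory names, so it is a
    heuristic, not a guarantee that what remains is the standard library.
    """
    blocked = {"site-packages", "dist-packages"}
    kept = []
    for entry in path_entries:
        # Split the normalized path into components and look for a blocked one.
        parts = set(os.path.normpath(entry).split(os.sep))
        if parts & blocked:
            continue
        kept.append(entry)
    return kept
```

A startup script could apply it with `sys.path[:] = strip_site_dirs(sys.path)` before importing anything else. Starting the interpreter with the `-S` and `-E` flags (skip the site module, ignore PYTHON* environment variables) covers similar ground from the command line.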

Google's "Operation Purple Boa" plan seems to suggest doing something like this: https://docs.google.com/document/d/1dQjbbLEJqxUIJWmH5sIZAv-_emnKksDI-VCG1v86dWA/edit