bazelbuild / rules_python

Bazel Python Rules
https://rules-python.readthedocs.io
Apache License 2.0
521 stars 536 forks source link

Hermetic `chmod`/`touch`/`id` in `python_repository` #2016

Open mattyclarkson opened 3 months ago

mattyclarkson commented 3 months ago

🚀 feature request

Relevant Rules

python_repository

Description

We currently call out the to system chmod, touch and id binaries.

We could do the equivalent functionality with Python.

Describe the solution you'd like

The Python interpreter is downloaded in python_repository. If we provide a script as a private attribute to the repository rule, we can perform the readonly permissions change with the downloaded Python interpreter.

Describe alternatives you've considered

None. Python has the functionality to perform the chmod/touch/id.

aignas commented 3 months ago

I am not sure we can do it:

  1. The chmod is there to make the filesystem read-only to avoid having pyc files in the toolchain and we run the python interpreter we have just downlaoded in the said repository, then we will create pyc files, which make the builds non-deterministic.
  2. Touch is just a test that checks that we have succeeded in creating a read-only directory, we could use Python for that, maybe? However, I think it would be better to change logic to attempt to delete a file using repository_ctx.delete.
  3. Maybe it is possible to use Python here.

What is the problem here that you are trying to solve? I'm thinking that we are missing context here (see xy problem)

mattyclarkson commented 3 months ago

then we will create pyc files

For the chmod step, I thought we could use -B or PYTHONDONTWRITEBYTECODE to prevent writing the .pyc files.

I think it would be better to change logic to attempt to delete a file using repository_ctx.delete.

Agreed. That would be a good solution.

What is the problem here that you are trying to solve?

Apologies, I should have been more explicit in the description.

In our organisation, we run our Bazel inside a container that has zero core utilites installed to prove that our builds are truely hermetic. We resolve our core utilties through rules_coreutils. This helps us avoid version differences in core utilities or the GNU/BSD differences. Unfortunately, for rules_python it means that we hit the chmod/id/touch missing binaries.

In this case, it is entirely nitpicky: the usage is POSIX compliant. However, it seems that we could get away with doing it with Python, remove the dependency and make the download hermetic with respect to the host system.

aignas commented 3 months ago

One reservation that I have is that it suffers from the same problem - how do you handle downloading for the platform that is not the host platform. Then you need a reference to the host interpreter. That said, only the host interpreter needs this special treatment, because the other ones just get packaged up without being used in any way.

I personally think that this could be an interesting approach to dropping some system deps.

PRs welcome. :)

rickeylev commented 3 months ago

re: -B flag and writing bytecode

It's fine for the pyc to be generated at repo-rule time, for a few reasons:

  1. As far as loading/analysis phase is concerned, they're just source files, so they're static and unchanging. The non-determinism concern is if they're generated at build time.
  2. The globs for the runtime's files ignore pyc files. This prevents them from being included in build analysis (the runtime might still use them if it isn't e.g. sandboxed)
  3. Since the originating .py files don't change, Python will use the existing pyc and not generate a new one.

We could probably precompile the whole stdlib using //tools/precompiler at repo install time and then we could drop the .pyc exlcusion from the globs

mattyclarkson commented 2 months ago

We could probably precompile the whole stdlib using //tools/precompiler at repo install time and then we could drop the .pyc exlcusion from the globs

Agree. Would we be happy with a follow up issue/PR? Does making the current setup hermetic satisfy this issue?