A way to configure sys.executable flags at PEX runtime

bkad commented 1 month ago

I noticed that Python flags are supported in interpreter mode via the --inject-args flag, and I wonder if something similar could be supported when running entry points.

jsirois commented 1 month ago

@bkad can you give a concrete example of what you're aiming for here? Passing flags to entry points is definitely supported already using --inject-args. See this example use: https://docs.pex-tool.org/recipes.html#uvicorn-and-other-customizable-application-servers. What you can't do is pass flags to CPython itself when running an entry point (like, say -B to skip bytecode generation). Is that the style of thing you want to do here?

bkad commented 1 month ago

> What you can't do is pass flags to CPython itself

This is what I'm trying to do, pass -u to CPython. Alternatively, I could use the PYTHONUNBUFFERED environment variable, but I'd like to configure this within the PEX at build time rather than everywhere the PEX is executed. Passing PYTHONUNBUFFERED through --inject-env has no effect on the re-executed interpreter for some reason.

jsirois commented 1 month ago

Ok. Yeah, you generally can't do this today. Only one incantation gets this to work, and then only after the 1st run:

exe.py:

import sys

if __name__ == "__main__":
  print(sys.stdout.buffer)

Pex command:

:; pex --exe exe.py -o test.pex --python-shebang "/usr/bin/env python -u" --sh-boot

Nets:

:; ./test.pex
<_io.BufferedWriter name='<stdout>'>
:; ./test.pex
<_io.FileIO name='<stdout>' mode='wb' closefd=False>
:; ./test.pex
<_io.FileIO name='<stdout>' mode='wb' closefd=False>
...

Why this works is fairly arcane. Basically, --sh-boot respects any args it finds in --python-shebang (even if the --python-shebang specified is not valid!). After the 1st boot, when the PEX is laid out on disk under the PEX_ROOT, --sh-boot optimizes by directly calling <full path to compatible python it discovered> <python args from --python-shebang> <installed PEX path/__main__.py> "$@". Notably, this only works for plain PEXes and not --venv PEXes. Since most PEX users probably want both --sh-boot and --venv for minimum boot latency and maximum Python ecosystem compatibility, the example incantation is not generally viable.

In summary, @bkad this seems like a good feature to support across all PEX styles.

jsirois commented 1 month ago

For a bit more insight on the --sh-boot machinery, using the example test.pex above:

:; head -123 test.pex
#!/bin/sh
# N.B.: This script should stick to syntax defined for POSIX `sh` and avoid non-builtins.
# See: https://pubs.opengroup.org/onlinepubs/9699919799/idx/shell.html
set -eu

VENV=""
VENV_PYTHON_ARGS="-sE"

# N.B.: This ensures tilde-expansion of the DEFAULT_PEX_ROOT value.
DEFAULT_PEX_ROOT="$(echo ~/.pex)"

DEFAULT_PYTHON="/usr/bin/env"
DEFAULT_PYTHON_ARGS="python -u"

PEX_ROOT="${PEX_ROOT:-${DEFAULT_PEX_ROOT}}"
INSTALLED_PEX="${PEX_ROOT}/unzipped_pexes/466b0553ddeea983208b65ba0b5f9870e10b1127"

if [ -n "${VENV}" -a -x "${INSTALLED_PEX}" ]; then
    # We're a --venv execution mode PEX installed under the PEX_ROOT and the venv
    # interpreter to use is embedded in the shebang of our venv pex script; so just
    # execute that script directly.
    export PEX="$0"
    if [ -n "${VENV_PYTHON_ARGS}" ]; then
        exec "${INSTALLED_PEX}/bin/python" "${VENV_PYTHON_ARGS}" "${INSTALLED_PEX}" \
            "$@"
    else
        exec "${INSTALLED_PEX}/bin/python" "${INSTALLED_PEX}" "$@"
    fi
fi

find_python() {
    for python in \
"python3.12" \
"python3.11" \
"python3.10" \
"python3.9" \
"python3.8" \
"python3.7" \
"python3.6" \
"python3.5" \
"python2.7" \
"pypy3.12" \
"pypy3.11" \
"pypy3.10" \
"pypy3.9" \
"pypy3.8" \
"pypy3.7" \
"pypy3.6" \
"pypy3.5" \
"pypy2.7" \
"python3" \
"python2" \
"pypy3" \
"pypy2" \
"python" \
"pypy" \
    ; do
        if command -v "${python}" 2>/dev/null; then
            return
        fi
    done
}

if [ -x "${DEFAULT_PYTHON}" ]; then
    python_exe="${DEFAULT_PYTHON} ${DEFAULT_PYTHON_ARGS}"
else
    python_exe="$(find_python)"
fi
if [ -n "${python_exe}" ]; then
    if [ -n "${PEX_VERBOSE:-}" ]; then
        echo >&2 "$0 used /bin/sh boot to select python: ${python_exe} for re-exec..."
    fi
    if [ -z "${VENV}" -a -e "${INSTALLED_PEX}" ]; then
        # We're a --zipapp execution mode PEX installed under the PEX_ROOT with a
        # __main__.py in our top-level directory; so execute Python against that
        # directory.
        export __PEX_EXE__="$0"
        exec ${python_exe} "${INSTALLED_PEX}" "$@"
    else
        # The slow path: this PEX zipapp is not installed yet. Run the PEX zipapp so it
        # can install itself, rebuilding its fast path layout under the PEX_ROOT.
        if [ -n "${PEX_VERBOSE:-}" ]; then
            echo >&2 "Running zipapp pex to lay itself out under PEX_ROOT."
        fi
        exec ${python_exe} "$0" "$@"
    fi
fi

echo >&2 "Failed to find any of these python binaries on the PATH:"
for python in \
"python3.12" \
"python3.11" \
"python3.10" \
"python3.9" \
"python3.8" \
"python3.7" \
"python3.6" \
"python3.5" \
"python2.7" \
"pypy3.12" \
"pypy3.11" \
"pypy3.10" \
"pypy3.9" \
"pypy3.8" \
"pypy3.7" \
"pypy3.6" \
"pypy3.5" \
"pypy2.7" \
"python3" \
"python2" \
"pypy3" \
"pypy2" \
"python" \
"pypy" \
; do
    echo >&2 "${python}"
done
echo >&2 "Either adjust your $PATH which is currently:"
echo >&2 "${PATH}"
echo >&2 -n "Or else install an appropriate Python that provides one of the binaries in "
echo >&2 "this list."
exit 1
<binary zip content elided>

Basically all the Pex python bootstrap logic is replaced with shell to drop boot overhead from ~50ms to ~1ms (Python is SLOW).
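
To make the branch logic of the script above easier to follow, here is a rough Python model of which command the shell boot ends up exec'ing. It is only a sketch: `sh_boot_command` and its parameters are hypothetical names for illustration, not part of Pex, and it assumes DEFAULT_PYTHON is executable so the find_python fallback is skipped.

```python
def sh_boot_command(
    pex_path,                 # "$0": the PEX file being run
    installed_pex,            # INSTALLED_PEX: the layout under the PEX_ROOT
    args,                     # "$@": arguments passed to the PEX
    venv,                     # True when VENV is non-empty (--venv PEX)
    installed,                # True when INSTALLED_PEX already exists on disk
    default_python="/usr/bin/env",
    default_python_args="python -u",
    venv_python_args="-sE",
):
    """Return the argv the shell boot script would exec (illustrative model)."""
    if venv and installed:
        # --venv fast path: the venv interpreter is used with VENV_PYTHON_ARGS
        # only; the args from --python-shebang never appear here.
        cmd = [installed_pex + "/bin/python"]
        if venv_python_args:
            cmd.append(venv_python_args)
        return cmd + [installed_pex] + list(args)

    # The unquoted ${python_exe} in the script is word-split by the shell;
    # str.split() approximates that for simple values.
    python_exe = (default_python + " " + default_python_args).split()

    if not venv and installed:
        # Zipapp fast path: run Python directly against the installed layout.
        return python_exe + [installed_pex] + list(args)

    # Slow path: run the PEX zipapp so it can install itself.
    return python_exe + [pex_path] + list(args)
```

Under this model, the -u from --python-shebang shows up in the zipapp paths but not in the --venv fast path, which matches the observation that the incantation does not help --venv PEXes.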

jsirois commented 1 month ago

Noting that the trick to all this will be protecting -sE which is the standard set of Pex python args used to enforce the hermeticity guarantees of PEX files (the -E is only meant to turn off PYTHONPATH but it also turns off things like PYTHONUNBUFFERED). ~All Python flags besides -s and ~all PYTHON* env vars besides PYTHONPATH are likely safe to let propagate to the final exec'd process.
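
The interaction between -E and PYTHONUNBUFFERED can be checked outside of Pex with a small probe. The sketch below is not Pex code; `stdout_buffer_type` is a hypothetical helper that launches the current interpreter with the given flags and reports the type of `sys.stdout.buffer`, using the same check as the exe.py example earlier in the thread.

```python
import os
import subprocess
import sys

# Same check as exe.py above, but printing only the type name.
PROBE = "import sys; print(type(sys.stdout.buffer).__name__)"


def stdout_buffer_type(python_args, env_overrides=None):
    """Run the current interpreter with python_args and report its stdout buffer type."""
    env = dict(os.environ)
    # Start from a known state so the caller's environment does not skew the result.
    env.pop("PYTHONUNBUFFERED", None)
    env.update(env_overrides or {})
    result = subprocess.run(
        [sys.executable] + list(python_args) + ["-c", PROBE],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
        universal_newlines=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    unbuffered_env = {"PYTHONUNBUFFERED": "1"}
    print("no flags:              ", stdout_buffer_type([]))
    print("PYTHONUNBUFFERED=1:    ", stdout_buffer_type([], unbuffered_env))
    print("-E + PYTHONUNBUFFERED: ", stdout_buffer_type(["-E"], unbuffered_env))
    print("-sE -u:                ", stdout_buffer_type(["-sE", "-u"]))
```

On CPython 3 the expectation is that PYTHONUNBUFFERED=1 alone yields FileIO, that adding -E brings back BufferedWriter because the variable is ignored, and that an explicit -u still yields FileIO alongside -sE. That is why an explicit interpreter flag can coexist with the hermeticity flags while the environment variable cannot.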

jsirois commented 1 month ago

Ok, some ctypes hackery to implement sys.orig_argv for CPython<3.10 gets this[^1] working for all CPythons Pex supports and PyPy>=3.10. It will not work for the PyPy>=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,<3.10 that Pex supports because I can find no way to access the original PyPy process argv for those PyPy releases.
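
For readers curious what such a fallback can look like, here is a minimal sketch. It is not taken from the Pex source and may differ from what Pex actually does: it assumes CPython 3, where the C API function `Py_GetArgcArgv` exposes the original argv as wide strings, and `original_argv` is a hypothetical helper name.

```python
import ctypes
import sys


def original_argv():
    """Return the interpreter's original command line, including Python flags."""
    # Python 3.10+ exposes this directly.
    orig = getattr(sys, "orig_argv", None)
    if orig is not None:
        return list(orig)

    # Assumed fallback for older CPython 3: ask the C API for argc/argv.
    # Signature: void Py_GetArgcArgv(int *argc, wchar_t ***argv)
    argc = ctypes.c_int()
    argv = ctypes.POINTER(ctypes.c_wchar_p)()
    get_argc_argv = ctypes.pythonapi.Py_GetArgcArgv
    get_argc_argv.restype = None
    get_argc_argv(ctypes.byref(argc), ctypes.byref(argv))
    return [argv[i] for i in range(argc.value)]


if __name__ == "__main__":
    # Running this file as `python -u thisfile.py` should include "-u" in the
    # list, whereas sys.argv only ever starts at the script path.
    print(original_argv())
```

The point of recovering the original argv is that `sys.argv` has the interpreter flags stripped, so a re-exec that only consults `sys.argv` loses flags like -u.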

[^1]: Where "this" is more than the OP. It's support for python -u my.pex as the basis for the OP request which will use a new --inject-python-args option similar to the existing --inject-args option (with no support for leaking PYTHON* at all; i.e.: the ubiquitous -sE used by Pex behind the scenes stays in force).

jsirois commented 4 weeks ago

Alrighty, this is now released in 2.4.0: https://github.com/pex-tool/pex/releases/tag/v2.4.0

Thanks @bkad.

bkad commented 4 weeks ago

Much appreciated!

pex-tool / pex

A way to configure sys.executable flags at PEX runtime #2422