python / cpython


`$COLUMNS` environment variable present in subprocess only when `env=None` #100516

Open asottile opened 1 year ago

asottile commented 1 year ago

Bug report

for some reason which I cannot determine (perhaps a subtle difference between execvp and execvpe?) the $COLUMNS variable is mysteriously present in subprocesses when using env=None but not when using env=os.environ (which I expect to be identical in behaviour). this also seems to only happen with an interactive session?

here's a minimal case:

import sys, subprocess, os

subprocess.check_output(
    (sys.executable, '-c', 'import os; print(os.environ["COLUMNS"])'),
    # env=os.environ,
)

$ python3.11 -i t4.py
>>> 
$ python3.11 t4.py
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<frozen os>", line 679, in __getitem__
KeyError: 'COLUMNS'
Traceback (most recent call last):
  File "/home/asottile/workspace/cpython/t4.py", line 3, in <module>
    subprocess.check_output(
  File "/usr/lib/python3.11/subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('/usr/bin/python3.11', '-c', 'import os; print(os.environ["COLUMNS"])')' returned non-zero exit status 1.

if I un-comment the line in the example script above -- it fails in both places:

$ sed -i 's/#//g' -- t4.py
$ python3.11 -i t4.py
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<frozen os>", line 679, in __getitem__
KeyError: 'COLUMNS'
Traceback (most recent call last):
  File "/home/asottile/workspace/cpython/t4.py", line 3, in <module>
    subprocess.check_output(
  File "/usr/lib/python3.11/subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('/usr/bin/python3.11', '-c', 'import os; print(os.environ["COLUMNS"])')' returned non-zero exit status 1.
>>> 
$ python3.11 t4.py
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<frozen os>", line 679, in __getitem__
KeyError: 'COLUMNS'
Traceback (most recent call last):
  File "/home/asottile/workspace/cpython/t4.py", line 3, in <module>
    subprocess.check_output(
  File "/usr/lib/python3.11/subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('/usr/bin/python3.11', '-c', 'import os; print(os.environ["COLUMNS"])')' returned non-zero exit status 1.

normally $COLUMNS is a shell variable and not an environment variable, but that doesn't really explain where it's coming from and why it's different depending both on interactiveness and on env= parameter:

| | env=None | env=os.environ |
|---|---|---|
| -i | COLUMNS present | KeyError |
| no -i | KeyError | KeyError |

Your environment

matthewhughes934 commented 1 year ago

With `-i` passed, `sys.__interactivehook__` is called at startup (https://docs.python.org/3/library/sys.html#sys.__interactivehook__). The default hook will import readline (via https://github.com/python/cpython/blob/046cbc2080360b0b0bbe6ea7554045a6bbbd94bd/Lib/site.py#L485 and https://github.com/python/cpython/blob/046cbc2080360b0b0bbe6ea7554045a6bbbd94bd/Lib/site.py#L440).

GNU readline has a setting, `rl_change_environment`, that modifies the LINES and COLUMNS variables in the environment under certain circumstances. It defaults to 1 (i.e. those variables will be modified): https://tiswww.case.edu/php/chet/readline/readline.html#index-rl_005fchange_005fenvironment

You can demonstrate this by adding `import readline` to the top of your PoC and then running it without `-i`, e.g. just as `python t.py`
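A rough way to watch this happen is to read the C-level environment directly before and after importing readline. This is only a sketch: it assumes a POSIX system where `ctypes.CDLL(None)` exposes libc's `getenv`, the helper names `c_getenv` and `columns_around_readline_import` are made up for illustration, and whether COLUMNS actually shows up depends on the readline build and on whether a terminal is attached.

```python
import ctypes
import os

# assumption: POSIX, where CDLL(None) exposes symbols already loaded
# into the process, including libc's getenv
_libc = ctypes.CDLL(None)
_libc.getenv.argtypes = [ctypes.c_char_p]
_libc.getenv.restype = ctypes.c_char_p


def c_getenv(name):
    """Read a variable from the C-level environ, bypassing os.environ."""
    raw = _libc.getenv(os.fsencode(name))
    return None if raw is None else os.fsdecode(raw)


def columns_around_readline_import():
    """Return (C view before import, C view after import, os.environ view)."""
    before = c_getenv("COLUMNS")
    try:
        import readline  # noqa: F401  (imported only for its side effects)
    except ImportError:
        pass  # readline is optional; the comparison is then a no-op
    after = c_getenv("COLUMNS")
    return before, after, os.environ.get("COLUMNS")


if __name__ == "__main__":
    print(columns_around_readline_import())
```

If the second value is set while the first and third are `None`, readline changed the C environment without `os.environ` noticing.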

asottile commented 1 year ago

that partially explains it -- but it doesn't explain the difference between env=None and env=os.environ I think?

asottile commented 1 year ago

ok I think the actual explanation is that the C environ can be out of sync with the python os.environ -- for example:

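#include <Python.h>
#include <stdlib.h>

// METH_NOARGS functions still receive two arguments; the second is always NULL
static PyObject* _hello_world(PyObject* self, PyObject* Py_UNUSED(ignored)) {
    // changes the C-level environ directly, so os.environ never sees it
    putenv("asottile=wat");
    return PyUnicode_FromString("hello world");  // python c api
}

static struct PyMethodDef methods[] = {
    {"hello_world", _hello_world, METH_NOARGS},
    {NULL, NULL}  // sentinel
};

static struct PyModuleDef module = {
    PyModuleDef_HEAD_INIT,
    "_hello",  // module name
    NULL,      // no docstring
    -1,        // module keeps no per-interpreter state
    methods
};

PyMODINIT_FUNC PyInit__hello(void) {
    return PyModule_Create(&module);
}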

and then:

>>> import subprocess, sys, os, _hello
>>> _hello.hello_world()
'hello world'
>>> subprocess.call((sys.executable, '-c', 'import os; print(os.environ["asottile"])'))
wat
0
>>> subprocess.call((sys.executable, '-c', 'import os; print(os.environ["asottile"])'), env=os.environ)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.10/os.py", line 679, in __getitem__
    raise KeyError(key) from None
KeyError: 'asottile'
1
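The same desync can be reproduced without building an extension module, by calling libc's `setenv` through ctypes. This is a sketch under assumptions: it needs a POSIX system where `ctypes.CDLL(None)` exposes libc, and the variable name `GR_DEMO_C_ONLY` and the helper names are hypothetical.

```python
import ctypes
import os

# assumption: POSIX, where CDLL(None) exposes libc's setenv/getenv
_libc = ctypes.CDLL(None)
_libc.setenv.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_int]
_libc.setenv.restype = ctypes.c_int
_libc.getenv.argtypes = [ctypes.c_char_p]
_libc.getenv.restype = ctypes.c_char_p

NAME = "GR_DEMO_C_ONLY"  # hypothetical variable name, used only for this demo


def set_behind_pythons_back(name, value):
    """Set a variable in the C environ without going through os.environ."""
    if _libc.setenv(os.fsencode(name), os.fsencode(value), 1) != 0:
        raise OSError("setenv failed")


def views(name):
    """Return (value seen by C getenv, value seen by os.environ)."""
    raw = _libc.getenv(os.fsencode(name))
    c_view = None if raw is None else os.fsdecode(raw)
    return c_view, os.environ.get(name)


set_behind_pythons_back(NAME, "wat")
c_view, py_view = views(NAME)
print(c_view, py_view)  # wat None
```

The C side sees the new value while `os.environ` still reflects the snapshot taken at startup, which matches the two `subprocess.call` results above.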
matthewhughes934 commented 1 year ago

> ok I think the actual explanation is that the C environ can be out of sync with the python os.environ

:+1: yep, for (my) reference, the docs say: https://docs.python.org/3/library/os.html#os.environ

> This mapping is captured the first time the os module is imported, typically during Python startup as part of processing site.py. Changes to the environment made after this time are not reflected in os.environ, except for changes made by modifying os.environ directly.
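The "except for changes made by modifying os.environ directly" part can be checked with a small sketch: a variable set through `os.environ` reaches the child whether or not `env=` is passed. The variable name `GR_DEMO_VIA_OS_ENVIRON` and the helper `child_sees` are made up for illustration.

```python
import os
import subprocess
import sys

NAME = "GR_DEMO_VIA_OS_ENVIRON"  # hypothetical variable name for this demo


def child_sees(name, **popen_kwargs):
    """Run a child interpreter and return what it finds for `name`."""
    code = (
        "import os, sys; "
        "sys.stdout.write(os.environ.get(sys.argv[1], '<missing>'))"
    )
    return subprocess.check_output(
        (sys.executable, "-c", code, name), text=True, **popen_kwargs
    )


# assigning through os.environ updates both the mapping and the C environ
os.environ[NAME] = "wat"

inherited = child_sees(NAME)                 # env=None
explicit = child_sees(NAME, env=os.environ)  # env=os.environ
print(inherited, explicit)  # wat wat
```

So the two `env=` modes only disagree for variables that were changed at the C level behind Python's back, as readline does with COLUMNS.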