Closed 80d0d833-181a-4df0-a469-6e7405b7821b closed 5 years ago
CPython crashes with "pymain_compute_path0: memory allocation failed" when attempting to execute it with a library module from a previously deleted directory.
To reproduce:
cd \~/tmp/python_crash
rm -rf \~/tmp/python_crash
python3.7 -m pdb Fatal Python error: pymain_compute_path0: memory allocation failed ValueError: character U+ef3a8fe0 is not in range [U+0000; U+10ffff]
Current thread 0x000000011060e5c0 (most recent call first):
This is happening because _Py_wgetcwd returns NULL (although the underliying getcwd system call populates the fullpath
variable correctly) but its caller, _PyPathConfig_ComputeArgv0, does not check the return value:
_Py_wgetcwd(fullpath, Py_ARRAY_LENGTH(fullpath));
argv0 = fullpath;
and its caller, pymain_run_python, interprets a failure in _PyPathConfig_ComputeArgv0 as a memory problem:
PyObject *path0 = _PyPathConfig_ComputeArgv0(config->argc,
config->argv);
if (path0 == NULL) {
err = _Py_INIT_NO_MEMORY();
goto done;
}
_Py_wgetcwd() call has been introduced by the following commit:
commit ee3784594b33c72c3fdca6a71892d22f14045ab6 Author: Nick Coghlan \ncoghlan@gmail.com\ Date: Sun Mar 25 23:43:50 2018 +1000
bpo-33053: -m now adds *starting* directory to sys.path (GH-6231) (bpo-6236)
Historically, -m added the empty string as sys.path
zero, meaning it resolved imports against the current
working directory, the same way -c and the interactive
prompt do.
This changes the sys.path initialisation to add the
*starting* working directory as sys.path[0] instead,
such that changes to the working directory while the
program is running will have no effect on imports
when using the -m switch.
(cherry picked from commit d5d9e02dd3c6df06a8dd9ce75ee9b52976420a8b)
I don't think that it's correct to fail with a fatal error if the current directory no longer exist. Would it be possible to not add it to sys.path if it doesn't exist? Silently ignore the error.
@Nick: What do you think?
Omitting it from sys.path in that case makes sense to me, but I'm not sure what sys.argv[0] should be set to.
Omitting it from sys.path in that case makes sense to me, but I'm not sure what sys.argv[0] should be set to.
I propose to use ".". It would be consistent with platforms which doesn't implement realpath:
if (have_module_arg) {
#if defined(HAVE_REALPATH) || defined(MS_WINDOWS)
_Py_wgetcwd(fullpath, Py_ARRAY_LENGTH(fullpath));
argv0 = fullpath;
n = wcslen(argv0);
#else
argv0 = L".";
n = 1;
#endif
}
And it defers the error handler to later. Example of Python 3 running in a removed directory:
$ python3
Python 3.7.2 (default, Jan 16 2019, 19:49:22)
>>> import os
>>> os.getcwd()
FileNotFoundError: [Errno 2] No such file or directory
>>> os.path.abspath('.')
Traceback (most recent call last):
File "/usr/lib64/python3.7/posixpath.py", line 383, in abspath
cwd = os.getcwd()
FileNotFoundError: [Errno 2] No such file or directory
I would prefer "." than "-m".
If we set argv0 to ".":
if (_Py_wgetcwd(fullpath, Py_ARRAY_LENGTH(fullpath))) {
argv0 = fullpath;
n = wcslen(argv0);
}
else {
argv0 = L".";
n = 1;
}
then the caller does not have any way of knowing if the returned argv0 is due to the fact that _Py_wgetcwd failed so it will blindly add "." to sys.path unless we set the result using PyObject** and then returning some error code to signal what happened (or something similar). On the other hand, I do not like this API, because returning some error code != 0 from _PyPathConfig_ComputeArgv0 would be weird because the call actually succeeded (it has returned "." as the value for argv0).
What do you think is the best way to signal pymain_run_python that the current directory does not exist (because _Py_wgetcwd has failed)?
Oh. It seems like I misunderstood the issue. I read "argv0" as sys.argv[0], but _PyPathConfig_ComputeArgv0 is used to insert a value at the start of sys.path. That's different...
If the current directory has been removed, sys.path should just be left unchanged, no?
Yup. But what is the best way to signal the caller that _PyPathConfig_ComputeArgv0 has failed because _Py_wgetcwd has failed without changing the API massively?
Right now if _PyPathConfig_ComputeArgv0 returns null is assumed that is due to a memory error when calling PyUnicode_FromWideChar. So either we stop returning _Py_INIT_NO_MEMORY() and then skip appending to sys_path or we change the API to signal different problems to the caller.
Also, notice that the same function is used in sysmodule.c in PySys_SetArgvEx:
If argv[0] is not '-c' nor '-m', prepend argv[0] to sys.path.
New changeset dcf617152e1d4c4a5e7965733928858a9c0936ca by Victor Stinner in branch 'master': bpo-36236: Handle removed cwd at Python init (GH-12424) https://github.com/python/cpython/commit/dcf617152e1d4c4a5e7965733928858a9c0936ca
Can someone please on macOS to confirm that the bug is fixed?
It seems to work:
❯ uname -a Darwin C02VL073HTDG 18.2.0 Darwin Kernel Version 18.2.0: Thu Dec 20 20:46:53 PST 2018; root:xnu-4903.241.1\~1/RELEASE_X86_64 x86_64
❯ pwd /tmp/check
/tmp/check ❯ rm -rf ../check
/tmp/check ❯ \~/github/cpython/python.exe -m pdb usage: pdb.py [-c command] ... [-m module | pyfile] [arg] ...
Debug the Python program given by pyfile. Alternatively, an executable module or package to debug can be specified using the -m switch.
Initial commands are read from .pdbrc files in your home directory and in the current directory, if they exist. Commands supplied with -c are executed after commands from .pdbrc files.
To let the script run until an exception occurs, use "-c continue". To let the script run up to a given line X in the debugged file, use "-c 'until X'".
New changeset fc96e5474a7bda1c5dec66420e4467fc9f7ca968 by Victor Stinner in branch 'master': bpo-36236: Fix _PyPathConfig_ComputeSysPath0() for empty argv (GH-12441) https://github.com/python/cpython/commit/fc96e5474a7bda1c5dec66420e4467fc9f7ca968
Pablo: is Python 3.7 affected by the issue?
Yup:
~ ❯ cd /tmp
/tmp ❯ mkdir check
/tmp ❯ cd check
/tmp/check ❯ rm -rf ../check
/tmp/check ❯ python3.7 -m pdb Fatal Python error: pymain_compute_path0: memory allocation failed ValueError: character U+e0895f00 is not in range [U+0000; U+10ffff]
Current thread 0x000000011d9405c0 (most recent call first):
New changeset f7959a9fe7e7e316899c60251e29390c4ed0307a by Victor Stinner in branch '3.7': bpo-36236: Handle removed cwd at Python init (GH-12450) https://github.com/python/cpython/commit/f7959a9fe7e7e316899c60251e29390c4ed0307a
Pablo: Oh ok, thanks for testing. I fixed 3.7 as well. Would you mind to validate my fix?
The fix for 3.7 works too: no more crashing. Thanks, everyone!
The fix for 3.7 works too: no more crashing. Thanks, everyone!
I planned to test, but I had no access to macOS last years. Thanks for testing Ned. Good to hear that the bug is now fixed in 3.7 and master.
Thanks for sorting this out, Victor!
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = None closed_at =
created_at =
labels = ['OS-mac', '3.7', '3.8']
title = 'Python crash on macOS when CWD is invalid'
updated_at =
user = 'https://bugs.python.org/lkollar'
```
bugs.python.org fields:
```python
activity =
actor = 'ncoghlan'
assignee = 'none'
closed = True
closed_date =
closer = 'ned.deily'
components = ['macOS']
creation =
creator = 'lkollar'
dependencies = []
files = []
hgrepos = []
issue_num = 36236
keywords = ['patch']
message_count = 19.0
messages = ['337469', '337470', '337767', '338116', '338209', '338297', '338300', '338301', '338362', '338363', '338365', '338392', '338393', '338394', '338417', '338418', '338746', '338754', '339195']
nosy_count = 6.0
nosy_names = ['ronaldoussoren', 'ncoghlan', 'vstinner', 'ned.deily', 'lkollar', 'pablogsal']
pr_nums = ['12237', '12424', '12441', '12450']
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue36236'
versions = ['Python 3.7', 'Python 3.8']
```