Open terryjreedy opened 8 years ago
In the past few months, there has been an increase "IDLE fails to start" reports that are apparently due to shadowing of stdlib modules by user modules in the same directly as as user code. For instance, a beginner creates turtle.py and then writes "import turtle" in the same file or another in the same directory.
Note that only Python-coded modules can be shadowed. Imports of builtin modules that are not accelerators for a .py module are handled internally and do not involve sys.path. (If turtle were builtin, turtle.py would not interfere with 'import turtle.) In particular, sys is builtin and 'import sys' is guaranteed to work and give access to sys.path.
CPython avoids shadowing of the stdlib in its own imports. Perhaps it does not add '', standing for the current working directory, to the front of sys.path until after its imports are done. A self-contained application can avoid the problem by properly naming its own modules. An application like IDLE that runs user code needs to protect itself from user files. It does not now.
IDLE normally runs with two processes. I believe shadowing is more a problem for the user process that executes user code, with the working directory set to that of the user code. (I believe that this is one reason why user code must be saved, to a particular directory, before it is run.) Since, with one exception, all imports in run.py are done before running user code, I believe a sufficient fix would be temporarily delete '' while run.py does its own imports.
By default, the IDLE gui process should not have a problem. The working directory is that of python.exe, which initially has no conflicting files. To guard against user additions, '' could be permanently removed. However, for option -n single-process mode, it would have to be restored (a reason for wanting to remove that mode). I am not sure of the effect of all the other options. At present, I believe, all imports are done at startup. So the same idea used for the user process might work, at least for the present.
Automated tests would be nice. I might want to some day limit initial imports to the minimum needed to show the initial window (or windows), so IDLE might start faster. But they are not trivial. And there is the time factor. A single cold start of IDLE takes as long (about 4 seconds on my machine) as the average test file in the test suite.
Looking at https://stackoverflow.com/questions/15888186/cant-run-python-via-idle-from-explorer-2013-idles-subprocess-didnt-make-c and https://stackoverflow.com/questions/874757/python-idle-subprocess-error it seems that people have put things like tkinter.py, random.py, and subprocess.py in the python directory, along side of python.exe. So the IDLE process definitely also needs a patch.
This bug should be fixed urgently. Over at python-help we just got another Python beginner who was using IDLE and then started to get the "IDLE's subprocess didn't make connection" error. The problem was that the user had named one of their programs random.py. That's a bad introduction to Python and programming.
I will make this one of my priority issues once I get up to speed with the new workflow. However, it is not a trivial issue, and I am not sure that all my beliefs of a year ago were correct. What I really do not want to do is break IDLE running when users do not make a mistake.
Do you have access to IDLE on any system other than Windows?
From the viewpoint of IDLE and masking, the stdlib has three parts: idlelib, other modules that IDLE imports, and other modules that IDLE does not import. As I already noted, this division is different for IDLE and user processes.
IDLE may delay importing some idlelib modules, and I intend to do this more if possible. But I am not going to worry about someone creating an idlelib directory with duplicate names. If someone does this, too bad.
Modules that IDLE does not import are not a concern for the operation of IDLE. If a *user* program masks an stdlib module that the program intends to import, it is not IDLE's direct concern.
This issue is about other modules that IDLE *does* import -- directly *or* indirectly. IDLE does not import 'random' -- the word does not appear in idlelib. On the other hand, in 3.6.1,
>>> import sys
>>> 'random' in sys.modules
True
For a user random.py to block IDLE startup, it must be that some module that imports it accesses some attribute during import, possibly as part of a 'from' import.
In order to make new code unit-testable, it should be put in a new 'fix_path' function, with a detailed specification.
Do you have access to IDLE on any system other than Windows?
I don't have a Windows machine at all.
For what it's worth, here's the behavior I'm talking about, albeit with an old version of Python:
$ mkdir testidle
$ cd testidle
$ /usr/local/bin/python /Library/Frameworks/Python.framework/Versions/Current/lib/python2.7/idlelib/idle.py # Works
$ >random.py
$ /usr/local/bin/python /Library/Frameworks/Python.framework/Versions/Current/lib/python2.7/idlelib/idle.py
[. . .]
Unhandled server exception!
Thread: SockThread
[. . .]
import tempfile
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/tempfile.py", line 34, in <module>
from random import Random as _Random
ImportError: cannot import name Random
*** Unrecoverable, server exiting!
Of course no one starts IDLE from a command line in real life and so a beginner gets send down a blind alley with a misleading error message about firewall software.
I and others constantly tell people to start IDLE from the command line when they have a problem. Today, someone actually followed the advice and the traceback revealed that the problem was a personal tkinter.py masking the stdlib file! Renaming it to mytkinter.py fixed the problem. This tell me that 'fixing' sys.path for the idle process must be done immediately after importing sys and before importing tkinter. In the user process, the sys import should again be first, followed by the fixup.
I have verified that a user 'sys.py' does not mask sys imports. Python itself imports sys, so IDLE's 'import sys' just gets the existing sys.modules['sys'].
Because sys module is correctly imported, we can modify sys.path to change the import behave.
Add new PR 1364 for this.
This add a new private function _fix_import_path
that will remove the local import path ''
, when running IDLE from command line, it will recover the import path at the end of the argparse part. So no need to worry about user to import other local modules.
That also mean, if our user has the same name file, they will not be able to import it anymore. (cause IDLE import it first).
Thank you for making a start on code for this issue. After thinking about this a few hours, here is what I currently think.
I don't want a tiny module for one function. I would like a module with objects common to the idle and user processes, in particular things needed by both pyshell and run modules. This would include things now imported into pyshell from run. (These were initially imported in the other direction, but switching sped up run startup.) I would like to refactor and include the near duplicate warning functions. The imports of io, linecache, and maybe warnings in the new module would come after the new function is called in the module. At least initially, I would not include anything dependent on another IDLE module.
Possible names: runutil, runcom, runshell, ??? (and test_...).
Call the new function fix_path (nearly everything in idlelib is private already). Add "def unfix_path(): if [path[0] != '': \<insert ''>. It is possible that we will need to return and pass back what, if anything, is removed. I am not sure yet.
When one runs 'python \<path>/file.py', sys.path[0] is \<path>, not ''. So one can start idlelib with, for instance, "python \<path-to_idlelib>/idle.py" and sys.path[0] will be \<idledir>. It slows imports slightly and could be removed (and never re-added). I need to check for other instances of sys.path[0] != '' and \<idledir> additions.
Here is one. Add "print(sys.path, file=sys.__stdout__)" to run.py and run it and the result in Shell is ['C:\\Programs\\Python36\\lib\\idlelib', 'C:\\Programs\\Python36\\Lib\\idlelib', 'C:\\Programs\\Python36\\python36.zip', ...
If I start IDLE from the console, path (in console) starts with '' instead of idledir x 2. It appears twice because of the tkinter input. The original sys.path is the same in both processes. If I then run a file with 'print sys.path', user-process sys.path begins with filedir, consoledir. I am not quite how '' is replaced by consoledir, but the presence of consoledir is wrong (explained elsewhere).
If I start from the icon, output never appears. The result needs to be saved and displayed otherwise, such as in a tkinter messagebox. If I then run the same file, path (displayed in Shell) begins with just the file directory.
More experiments are needed, such as starting from Explorer [Edit with IDLE]. We will need some repeated on Linux and MAC
In run.py, would "unfix()" be needed after imports? I think not when running a file, because pyshell.ModifiedInterpreter.prepend_syspath and MI.run_command together prepend usercodedir to sys.path. I believe this is true when running a startup file. So we only need to be concerned about restoring '' (or anything else deleted) sometime before running user code from the shell.
Within run, all but one of the delayed non-main imports are either duplicates that can be removed or can be moved to the top, where they will surely run before sys.path is augmented.
The un-movable delayed pydoc import is in the handle method of run.MyHandler(RPCHandler(socketserver.BaseRequestHandler)) and is called in the base .__init__. This must happen *before* run_command sends the rpc request to modify sys.path. So it should be safe if we do not previously unfix sys.path.
My conclusion: We may sometimes need to restore something so imports in Shell work the same as in Python REPL. The last possibility is in run.main just before 'while 1:'.
Importing in the IDLE process and pyshell are messy. As a temporary partial fix, we could unfix() after the top imports in pyshell, or after the top imports in pyshell.main, or at the bottom of pyshell.main just before entering the loop. But if there are any further delayed non-idlelib imports, the IDLE process remains vulnerable to shadow files. But I don't yet know if this is possible.
As with run, do we need to explicitly unfix in pyshell? In normal mode, no user code runs in the idle process. So unfix is not needed. With -n set, I think unfix is still not needed for user files, as MI.run_command runs the same userdir prepend in self.locals, where user code is exec'ed. Indeed, in order to protect all IDLE imports (in the IDLE process), sys.path should be left reduced *except while running user code in the IDLE process, in -n mode*.
The irreducible problem for -n is that once user code *is* run, the fact that IDLE simulates python -i and the fact that -n means no Shell restart, the directory cannot be removed. If there is a conflict, we cannot help it. The oddity of -n mode is that one can run additional files from other directories and make sys.path grow indefinitely. Thus files can interfere with each other as well as with IDLE. In python, one can run dir1/file1.py with -i and 'exec(open(dir2/file2.py).read)', and so on, but exec does not prepend dir2 to sys.path.
What about the pyshell.main section "# process sys.argv and sys.path"? I think the sys.path parts are wrong, at least for protecting IDLE imports, and should be removed.
for i in range(len(sys.path)):
sys.path[i] = os.path.abspath(sys.path[i])
Everything is already absolute except for '' and changing the '' added by python to abspath('') == getcwd() seems wrong. It makes the IDLE shell unnecessarily different from the Python REPL. If I run path-a> python -i path-b/tem.py I cannot import files in path-a. The same should be true if I run path-a> python -m idlelib and then edit and run path-b/tem.py.
elif args:
enable_edit = True
pathx = []
for filename in args:
pathx.append(os.path.dirname(filename))
for dir in pathx:
dir = os.path.abspath(dir)
if not dir in sys.path:
sys.path.insert(0, dir)
These insertions should only happen in -n mode and will happen if and when run, not before being opened to be edited (which could even fail).
else:
dir = os.getcwd()
if dir not in sys.path:
sys.path.insert(0, dir)
With '' changed (wrongly) to getcwd, this will usually do nothing. If I run 'python', sys.path start with '', not os.getcwd. If I os.chdir(newdir), then '' refers to newdir, not the startig dir. However, in -n mode
Conclusion: delete the above and add if not use_subprocess (and other conditions?): sys.path.insert(0, '') at the last possible moment, just before
I'm very willing to believe that I'm missing something critical here, but wouldn't it be satisfactory to just move "" from the beginning to the end of sys.path in IDLE?
I can imagine that it might cause a few problems for users, but it is simple and the problems would be easy to explain.
IDLE is commonly used by beginners and over at python-help I would not relish the prospect of saying, "With IDLE, the import rules change on account of various conditions...." I don't think that would be much better than the current state.
What I want from this issue , from a user perspective, is a) for IDLE to run without stumbling over user files, and b) for user code to see the same sys.path when executing under IDLE as when executed directly with the same cpython binary in the same mode.
a) for IDLE to run without stumbling over user files b) for user code to see the same sys.path when executing under IDLE as when executed directly with the same cpython binary in the same mode.
If you can achieve that, no one will be more impressed than I am. I would take the easy way out.
bpo-25488 and bpo-26143 address related issues of sys.path as seen by IDLE and user code. bpo-26143 focuses on what IDLE sees, and user modules shadowing IDLE modules, bpo-25488 focuses on what user code sees, and the presence of idlelib in sys.path making buggy code work. It includes startup information relevant to bpo-26143 also. Both should be solved by patching IDLE in same place.
I was CC-ed on a response to a user who reported module shadowing as a security issue. If it is disabled by default, we could add an option to enable for intentional experiments. See idle.py.
I believe standard interpreter only adds '' to path after its startup. But it might be vulnerable to shadowing of delayed imports.
"IDLE does not start if threading.py, etc, in same folder as my script." #103967 (closing as not bug and duplicate.) In normal mode, IDLE process could cd to python.exe dir co delayed imports work. Then make sure run imports, including indirects from imported modules, are done on run start (no delayed imports in anything called later -- might be difficult). Start with less than perfection.
For -n, add cautions to deprecation note.
Put common functions in util. Note that any leaf import from idlelib is a possible candidate for util.py. Other things would have to be recursively checked for delayed imports.
I've had this same sort of issue but sort of in reverse at a few points, when with the way IDLE adds the user's code to sys.path
, my IDLE extension was attempting to import a library I was in the process of making a pull request for instead of the system installation of said library. To get around it I made a context manager that temporarily removes items from sys.path
and restores them once the with block is done.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = 'https://github.com/terryjreedy' closed_at = None created_at =
labels = ['expert-IDLE', 'type-bug', '3.10']
title = "Ensure that IDLE's stdlib imports are from the stdlib"
updated_at =
user = 'https://github.com/terryjreedy'
```
bugs.python.org fields:
```python
activity =
actor = 'terry.reedy'
assignee = 'terry.reedy'
closed = False
closed_date = None
closer = None
components = ['IDLE']
creation =
creator = 'terry.reedy'
dependencies = []
files = []
hgrepos = []
issue_num = 26143
keywords = ['patch']
message_count = 15.0
messages = ['258496', '258499', '292130', '292159', '292160', '292169', '292546', '292632', '293679', '293723', '293728', '293741', '296367', '404180', '404181']
nosy_count = 3.0
nosy_names = ['terry.reedy', 'mdcowles', 'louielu']
pr_nums = ['1364']
priority = 'normal'
resolution = None
stage = 'test needed'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue26143'
versions = ['Python 3.10']
```