Open 05554c1b-08b9-4b47-8c42-47b86208a78d opened 14 years ago
Example:
---------------- mod.py ----------------
def f():
a,b,c = 1,2
print b
If i do:
>> import mod >> mod.f()
I get:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "mod.py", line 2, in f
a,b,c = 1,2
ValueError: need more than 2 values to unpack
If i fix the source:
---------------- mod.py ----------------
def f():
a,b,c = 1,2,3
print b
And do:
>> mod.f()
I get:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "mod.py", line 2, in f
a,b,c = 1,2,3
ValueError: need more than 2 values to unpack
The problem is that the source shown is updated, but the executed code is old, because it wasn't reloaded.
Feature request:
If the source code shown was modified after import time and it wasn't reloaded, a warning message should be shown.
Example:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "mod.py", line 2, in f WARNING: Modified after import!
a,b,c = 1,2,3
ValueError: need more than 2 values to unpack
or something like that. Maybe "use reload()" might appear.
Displaying a warning whenever the code has changed on disk is clearly unacceptable - it is both expensive to check (and also, how frequent would you check?), and it would be annoying if you have long-running Python applications so this would just clutter the log files.
-1 on this request; users should just learn this aspect of Python. Also, it is not possible to get this perfect for many reasons, e.g. when you change a class, it will be impossible to update the code of existing instances even with reload.
Martin, I am not talking about a loop checking source file changes.
I think the check could be done only when the traceback is printed.
The function PyErr_Display() calls PyTracBack_Print() and so on until _Py_DisplaySourceLine() is called. The latter reads the source file and prints the line. I don't know which is the best way to know if the source file was updated. But i think that it's possible to check it there.
I think this could apply only when using the interactive console.
I worked a few hours today and I have this patch. I tried to make a test but could not.
I know this is not a really good patch, but it's my first one and I wanted to show my idea.
I do not understand what the proposed feature is. I also think it should be discussed on the python-ideas mailing list first: http://mail.python.org/listinfo/python-ideas
I'm not sure this is useful to have. If you changed your code you know that you have to reload, so why would you want a warning that tells you that you changed the code?
For some reason I always had the opposite problem (i.e. after a reload the traceback was still showing the original code, and not the new one), while IIUC you are saying that it shows the new code even if the module is not reloaded. I tried your code and indeed it does what you say, so either I am mistaken and I've been misreading the tracebacks, or this changed from 2.6 to 2.7, or in some cases even the behavior (I think) I observed might happen. I'll have to verify this next time it happens.
On Fri, Oct 7, 2011 at 5:25 PM, Ezio Melotti \report@bugs.python.org\ wrote:
Ezio Melotti \ezio.melotti@gmail.com\ added the comment:
I'm not sure this is useful to have. If you changed your code you know that you have to reload, so why would you want a warning that tells you that you changed the code?
The source line showed in the traceback could not be the same line executed.
Take a look to this example:
k.py: def f(): a,b,c = 1,2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "k.py", line 2, in f
a,b,c = 1,2
ValueError: need more than 2 values to unpack
k.py: def f(): # blah a,b = 1,2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "k.py", line 2, in f
# blah
ValueError: need more than 2 values to unpack
For some reason I always had the opposite problem (i.e. after a reload the traceback was still showing the original code, and not the new one), while IIUC you are saying that it shows the new code even if the module is not reloaded. I tried your code and indeed it does what you say, so either I am mistaken and I've been misreading the tracebacks, or this changed from 2.6 to 2.7, or in some cases even the behavior (I think) I observed might happen. I'll have to verify this next time it happens.
That is strange, I think Python does not save the original code in any place.
The source line showed in the traceback could not be the same line executed.
My point is that when I see an error and modify the source to fix it, then I know that I'll have to reload. If for some reason I forget to reload, I'll get the wrong line in the traceback and then reload, but I don't think that ever happened to me. So to me, your warning will only be useful in the case where I modified the source, forgot to reload and got the same error again with a wrong line displayed. Also note that reloading is not so common; usually you just restart your application and that will give you the right traceback.
Also I'm not sure the warning you proposed is the best way to handle this.
So to me, your warning will only be useful in the case where I modified the source, forgot to reload and got the same error again with a wrong line displayed. Also note that reloading is not so common; usually you just restart your application and that will give you the right traceback.
I know the case when this happens is really unsusual, but the interperter could be able to alert you than that line of the traceback is wrong.
Also I'm not sure the warning you proposed is the best way to handle this.
I agree that, another approach is to save a copy of the source file associated to the compiled code (like a .pys).
(just reviewing the idea here, not the current patch)
The problem of "stale code" (i.e. what was executed doesn't match what is displayed in the traceback) is a tricky and subtle one. There are a few different cases:
Aspect 3 is a much deeper (and bigger) problem relating to native introspection support in function and class objects. But that doesn't make Diego's idea to improve Aspect 2 invalid - there have certainly been times when playing at the interactive prompt that such a warning would have reminded me that I needed to reload the module I was working on.
I just want to note that the code might be edited not only while it is running it in the interactive interpreter but also while it is running as a standalone script. In this case the script naturally would not know to reload its own code nor allow the user to do it manually, and even if it would, some objects created from the old code would probably remain alive.
Here is a simple shell script that causes Python to show a stack traceback with '# This is just a comment' as the offending line by launching it in the background and then overwriting the Python code. What happened in reality is that I launched a front-end tool for video transcoding I had written in Python and after a while tweaked the code while it was still running. When the actual encode finished, my (buggy) clean-up code raised an exception and I got a traceback saying it had been raised in a line that only had a comment in it.
cat \<\<EOF >proof.py import time time.sleep(5) raise Exception EOF python proof.py & sleep 1 cat \<\<EOF >proof.py
# This is just a comment EOF sleep 5
Martin v. Löwis (loewis) wrote:
Displaying a warning whenever the code has changed on disk is clearly unacceptable
As clarified, the request is only for when a traceback is being created (or perhaps even only for when one is being printed).
I agree that we don't want to watch every file every time any code is run, but by the time a traceback is being displayed, any tight loops are ending.
Nick Coghlan (ncoghlan) wrote:
There are a few different cases: ...
- Source has been changed, but module has not been reloaded ...
- Source has been changed, module has been reloaded, but object ...
Given that a traceback is being displayed, I think it is reasonable to rerun the find-module portion of import, and verify that there is not stale byte-code.
Frankly, I think it would be worth storing a file timestamp on modules, and verifying that whatever-would-be-imported-if-imported-now matches that timestamp. This would also catch case (3).
I also think that -- on traceback display -- it might be worth verifying that the code's __globals is the __globals associated with the module of that name in sys.modules. This would warn about some intentional manipulations, but would catch case (3) even more accurately.
Storing a marker in module objects which can be used to validate the linecache is a good idea. timestamp isn't appropriate because of modules loaded from zips or dynamic generation. I'd suggest we make it something opaque - we get source code by asking the loader for it anyway - if we make all source code loading go through the loader, then file based loaders can use a timestamp, zip based ones can shortcircuit or use a reference to the containing zip's file metadata or whatever.
Oh, it may be clear to everyone already but its perhaps worth noting: there are two ways the cache can skew.
(older source): We may have a newer file compiled and in use and the older source in the cache.
e.g. someone calls linecache.getlines(foo.py), then imports foo, and in between the contents of foo.py changed.
(newer source) Or we may have an older file compiled and in use, and newer source in the cache.
e.g. someone imports foo, the contents of foo.py change, and then someone calls linecache.getlines(foo.py).
One way to address the older source case would be to make importing (also compile()) flush the imported file out of the cache. That doesn't need any special logic or handling - just a new parameter to clearcache() to specify a file (or files?) to evict.
The newer source case is what needs some logic, and for that, as I said earlier, I think we need something opaque. Perhaps since we have importlib now, if we limit ourselves to considering actual python modules (linecache can technically cache anything with lines in it) we can rely on the import machinery get_source() to handle this: if the source has changed get_source should return None.
That way we don't need to change the data structure of modules at all - only packages for which there is no 302 loader will be able to suffer cache skew.
As far as informing callers that this situation has occurred, I think we'll need to think a bit about that, as linecache already signals 'source not available' by returning None, which doesn't leave much room to include an error message. We could start raising exceptions up the stack - which implies modifying PEP-302 (e.g. via a new PEP) and working up the stack.
Oh, meant to add - we could just call logging.warning or something.
As a point of update - this is a problem that can make debugging more difficult for people running web services or other server processes that are expected to stay alive for a long time. It is often undesirable to force a server restart or reload every time a new code base is deployed, so you can get into situations where code running does not match the code on disk.
If you get a crash and a traceback in such situation, you will be very confused.
As discussed already, this could be solved by taking a hash of the source code at module import time and then comparing that to the code on disk during traceback printing time - if there is a difference, then print a big fat warning that the traceback may be unreliable.
I just lost two days tracking down weird issues that in the end were caused by server failing to reload code when it should have done so. Having such warning on the first traceback would have alerted me to the problem much earlier.
A pure python demonstration of the problem looks like this (__file__
stores the path to the executed module):
with open(__file__) as f:
src = f.read()
with open(__file__, 'w') as f:
f.write('\n\n\n\n\n# Whoops! Wrong line!\n')
f.write(src)
raise NotImplementedError('The prepended lines have confused you.')
I second what was said by Aigars Mahinovs about long-running processes giving confusing tracebacks that make debugging very difficult. I have a Natural Language Processing pipeline extracting features from a large corpus of texts, and that process can take days to complete. If the underlying modules have since been edited, then when an Exception occurs – including a KeyboardInterrupt – then the traceback shows the wrong lines. The functions listed at the end of the line are correct, which is the only reason I was able to easily detect the source of my confusion; the line number cited was no longer inside of the listed function!
I propose one more simple thing to track that would be helpful in my situation: how many lines were in the file at call time vs now. It would be (potentially) helpful to have a warning point out that the cited module is now 17 lines longer than it was when it was imported. That way I can make more intelligent guesses about what line was actually the culprit. Obviously there could have been additions and deletions, which muddies the water, but this would at least be a starting point.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = None closed_at = None created_at =
labels = ['interpreter-core', 'type-feature']
title = 'Unupdated source file in traceback'
updated_at =
user = 'https://bugs.python.org/dmascialino'
```
bugs.python.org fields:
```python
activity =
actor = 'Robert Reynolds'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Interpreter Core']
creation =
creator = 'dmascialino'
dependencies = []
files = ['23321']
hgrepos = []
issue_num = 8087
keywords = ['patch']
message_count = 18.0
messages = ['100600', '100608', '100628', '144982', '145108', '145141', '145167', '145168', '145173', '145291', '147142', '153345', '237297', '238174', '238175', '300990', '370206', '370207']
nosy_count = 13.0
nosy_names = ['loewis', 'ncoghlan', 'rbcollins', 'jjconti', 'ezio.melotti', 'eric.araujo', 'ubershmekel', 'chortos', 'dmascialino', 'eric.snow', 'Jim.Jewett', 'Robert Reynolds', 'Aigars Mahinovs']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue8087'
versions = []
```