Open alexmojaki opened 5 years ago
Hey!
I need to make time to take a proper look, but to quickly reply to your points at the end:
Side effects in attribute lookups -- I completely agree, evaluating arbitrary attributes should at most be an opt-in config 💣
great! yeah, the line threshold is basically a crude approximation to what one really wants ("show just enough code so it makes sense"). This will be much nicer, working from semantically meaningful pieces
Ah I see, this would be more fine-grained than just noting the executed line number -- e.g. if my line is `foo(bar(zap(zop())))`, we can disambiguate that we were currently executing `zap`, etc. (Though in this example, you'd also see that by looking at the next deeper traceback entry.. do you have a better example?) I do like the highlighting idea!
OK so if I'm not completely mistaken this would be mostly a visual convenience feature (= you can still infer this information from the rest of the traceback, if the highlights aren't available) - in that case perhaps it'd be ok if only 'fancy mode' could indicate the active node, using color. Either way, there's still lots of time to play around with this. (One thing I wouldn't like very much though would be to sprinkle extra ascii art in people's code)
I kept the non-variable code monochrome deliberately (though keywords and operators are bold and slightly brighter), so that more of the color space is available to assign to variables -- I think combining syntactic and semantic colorization could easily become chaotic
(edit: rereading what you wrote I think I misunderstood - you meant that the color schemes have no control over the 'background' source color? But isn't the `source_default` color applied to them? (grep for `source_default` and `default_tpl`))
Oh yes, to be clear: Collapsing multiline statements wasn't an aesthetic choice, they just caused problems in certain corner cases due to my overburdened, token-based line parsing machine. In other words, this was totally just a hack. It will be a happy day when the token machine disappears due to your work & backslashed statements will be undisturbed forever after!
Awesome, I'm glad you're so happy with my proposals.
- ... OK so if I'm not completely mistaken this would be mostly a visual convenience feature (= you can still infer this information from the rest of the traceback, if the highlights aren't available)
That is often (probably mostly) the case, although there are plenty of exceptions:
`callback(x)` where `callback = foo`. In this case you could still probably figure it out by looking at the variables, but it's getting harder.
`foo[bar]` triggers `__getitem__`. For beginners this is crucial as they may not know what `__getitem__` means, but apart from that there could easily be several potential `__getitem__` calls in the line. In a way this is a mix of points 2 and 3.
An example of all of these is if you get an `IndexError` while digging into a nested list with `x[i][j][k]`.
We can check for 'normal' cases defined as follows:
`ast.Call` node, i.e. no magic methods.
In those cases we can hide the extra info by default. But I'm not sure if this would be a good thing.
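To make the ambiguity concrete, here is a minimal sketch using only the standard `ast` module (Python 3.8+ for `ast.get_source_segment`). It is not how `executing` identifies the active node; the helper name `candidate_operations` is made up for illustration. It only shows that a line number alone cannot tell these operations apart:

```python
import ast


def candidate_operations(source_line):
    """List the sub-expressions in one line that could each raise at runtime.

    Hypothetical helper: subscripts are labelled as implicit __getitem__
    calls, explicit calls as 'call' (the 'normal' case).
    """
    tree = ast.parse(source_line)
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Subscript):
            kind = "__getitem__"
        elif isinstance(node, ast.Call):
            kind = "call"
        else:
            continue
        found.append((kind, ast.get_source_segment(source_line, node)))
    return found


ops = candidate_operations("x[i][j][k]")
for kind, text in ops:
    print(kind, text)
```

All three subscripts share one line number, so an `IndexError` raised there could come from any of them unless the traceback marks the executing node.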
- ... you meant that the color schemes have no control over the 'background' source color?
Yes, but I wasn't clear, I was talking about within this PR. Since I'm not iterating through every token, there is no longer a place to apply `default_tpl`. So those bits of code are not highlighted and they use the terminal's default color, which is typically the opposite of the background and may be configured by the user independently. If that's a problem I think I can fix it by inserting e.g. `</variable><default>` where I currently just insert `</variable>`.
`source_lines` and `source_lines_after`? I'm thinking we just leave them as they are and add a note in the docs such as "Multiline statements are treated as a single line". It's not a complete explanation but I think it's close enough and users don't need to know the exact details.
`source_lines=1` to actually potentially show multiple lines, especially in the summary traceback?
`source_lines_after` is currently not passed as an argument anywhere. Is that intentional?
ranges = [
    Range(
        node.first_token.start[1],
        node.last_token.end[1],
        (variable, node),
    )
    for variable, node in self.frame_info.variables_by_lineno[self.lineno]
]
would become something like:
ranges = [
    Range(
        (...)
    )
    for variable, node in self.frame_info.variables_by_lineno[self.lineno]
]
Currently the default is to display at most 6 lines (including the `(...)` if needed) per piece, except for the currently executing piece which is never truncated. Would you like to let users configure this? If so give me a name and some documentation that you think fits your library and intended audience.
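For discussion, the truncation rule described above could be sketched roughly like this. This is not stack_data's actual code: the function name, the `max_lines` parameter and the head/tail split are assumptions for illustration. Only the limit of 6 lines including the `(...)` marker, and the exemption for the executing piece, come from the description above.

```python
def truncate_piece(lines, max_lines=6, is_executing=False, marker="(...)"):
    """Shorten one piece to at most max_lines entries, marker included.

    Hypothetical sketch: keeps the start and end of the piece and replaces
    the middle with the marker. The executing piece is returned untouched.
    """
    lines = list(lines)
    if is_executing or len(lines) <= max_lines:
        return lines
    keep = max_lines - 1        # one slot is used by the marker itself
    head = (keep + 1) // 2      # slightly favour the beginning of the piece
    tail = keep - head
    return lines[:head] + [marker] + (lines[-tail:] if tail else [])


piece = ["line %d" % i for i in range(10)]
shortened = truncate_piece(piece)
untouched = truncate_piece(piece, is_executing=True)
```

A single user-facing number (here `max_lines`) would be enough to configure this, whatever name ends up fitting the library.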
I just pushed an update to `stack_data` which relies on an update to `executing`, so make sure you upgrade that.
OK, I've created https://github.com/alexmojaki/pure_eval, so `pip install -e` that as well and of course update stack_data. So this PR now has safe attribute and subscript inspection.
There's not that much left for me to implement. I'm very keen to hear your responses to the questions above, particularly the first 3 here.
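To illustrate what "safe" attribute inspection means here, the following is a minimal standard-library sketch, not pure_eval's implementation. The class `Account` and the helper `safe_attribute` are hypothetical. It uses `inspect.getattr_static`, which looks an attribute up without invoking descriptors, and refuses anything that would need code to run:

```python
import inspect

_MISSING = object()


class Account:
    def __init__(self):
        self.balance = 10   # plain value stored in the instance __dict__
        self.reads = 0

    @property
    def audited_balance(self):
        self.reads += 1     # side effect: a normal getattr() mutates state
        return self.balance


def safe_attribute(obj, name):
    """Return (True, value) if `name` can be read without running any code."""
    value = inspect.getattr_static(obj, name, _MISSING)
    if value is _MISSING:
        return False, None
    # Properties, functions and other descriptors need __get__ to be called
    # to produce a value, so they are skipped (a deliberately conservative rule).
    if hasattr(type(value), "__get__"):
        return False, None
    return True, value


acct = Account()
plain = safe_attribute(acct, "balance")
prop = safe_attribute(acct, "audited_balance")
```

The property is reported as not safely readable and `acct.reads` stays at 0, whereas a plain `getattr(acct, "audited_balance")` would have incremented it.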
Any progress on this?
Hey @luzpaz, the current blocker is that I've asked @cknd a bunch of questions to which I'm awaiting a response.
I'm curious, what's your interest in this PR?
@alexmojaki i'm looking forward to the perks this PR introduces per https://github.com/cknd/stackprinter/pull/23#issue-312983686
Hey all! I haven't forgotten about this, I just found it surprisingly difficult to find a free weekend lately. I'll need your patience a bit longer
Hi @cknd, that's fine, just remember that for now you only need to look at the conversation, not the code changes. I still have plenty more to do.
Hey, apologies for the delays. To pick up this thread again:
`source_lines` and `source_lines_after` parameters: I'd just remove the `source_lines_after` parameter and let `source_lines` control the maximum nr of lines printed for one frame (perhaps renaming it to `max_lines`).
If we still specify a number of lines, then I don't understand what's going to change and how the division into pieces (logical blocks) will be used. Can you elaborate, maybe with examples?
Anyway, in the meantime stack_data and pure_eval have been pretty fully developed. They have complete tests and documentation and are available on PyPI. Hopefully soon they will be integrated into IPython. I suggest you read through the stack_data README.
By the way, the new dependencies do not and will not support Python 3.4, which is well past its EOL, so this PR is dropping it.
If we still specify a number of lines, then I don't understand what's going to change and how the division into pieces (logical blocks) will be used. Can you elaborate, maybe with examples?
Hm, it seems I'm still lacking intuition for how the new stack_data block snipper behaves on real code. I need to play with it some more. Roughly, my thought is: since the blocks can have varying sizes (1-6 lines with the current default afaik), for a given maximum nr of blocks to be printed, the resulting printout can vary a lot in length, depending on the nesting structure of the code and the size of the nested blocks. But since the main motive for printing less than the whole scope is to save screen space, any (future) config to limit printing should have a mostly predictable effect on the space that will be used.
I'll play with the new snipping mechanism and see how I can map it to this underlying motive, i.e.: Given a certain quantifiable desire to save screen space, how can we select the parts of code most likely to help with debugging. I have a hunch that logical blocks are part of the answer.
(random idea that disregards implementation effort: One way to reconcile block selection and a configurable output size might be to treat it as a bin packing problem, packing a number of logical blocks that fits the configured size of printout, prioritized by expected usefulness (starting with the executed line, then outer blocks that explain the control flow to the executed line, and previous occurrences of the variables in the lines printed so far, etc.) /random idea that disregards implementation effort)
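A rough sketch of that idea, for discussion only: the version below is a simple greedy pass over blocks sorted by priority, not a real bin-packing or knapsack solver, and the `Block` type, its fields and the priority numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    start: int      # first source line of the block (1-based)
    n_lines: int    # how many lines the block occupies when printed
    priority: int   # lower = more useful (0 would be the executed line)


def select_blocks(blocks, max_lines):
    """Greedily pick the most useful blocks that fit into max_lines."""
    chosen = []
    used = 0
    for block in sorted(blocks, key=lambda b: b.priority):
        if used + block.n_lines <= max_lines:
            chosen.append(block)
            used += block.n_lines
    # Print in source order, regardless of the order they were picked in.
    return sorted(chosen, key=lambda b: b.start)


blocks = [
    Block(start=10, n_lines=1, priority=0),  # the executed line
    Block(start=5, n_lines=4, priority=1),   # enclosing control flow
    Block(start=1, n_lines=3, priority=2),   # earlier use of a variable
    Block(start=12, n_lines=2, priority=3),  # trailing context
]
selected = select_blocks(blocks, max_lines=6)
```

With a budget of 6 lines this keeps the executed line and the enclosing block (5 lines in total) and drops the rest, so the printed size stays bounded by the configured number.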
As discussed in #22
This is still very much a WIP, but you can run the demos and see output that looks right until you inspect it more closely. I'm putting some work out early so you can see what's coming and we can start some conversations. To try it out, clone https://github.com/alexmojaki/stack_data and `pip install -e <path to folder>` in your interpreter where you work on stackprinter.
A couple of perks you can already observe:
`__qualname__` thanks to `executing`.
It shows something like this:
The source line `1 / 0` and its context are absent. This is now fixed.
Things I plan on doing soon which should be quite quick and easy:
Changes we need to discuss:
`pure_eval` which accepts an AST node and returns a bunch of sub-expressions which can be safely evaluated without triggering any side effects and their corresponding values. This will include for example attributes which are simply present in `__dict__` (no properties) or subscripts `a[b]` where `a` has a known type like a list or dict. I think it's dangerous to evaluate arbitrary attributes like you're doing now because this risks mutating the state of the program which could interfere with someone's debugging. `better-exceptions` has a similar view - they are currently trying to integrate the display of attributes but only using `getattr_static`. With this in mind there would no longer be such a thing as UnresolvedAttribute.
`if` condition is a single piece. Context is measured in numbers of pieces, e.g. the default is to include 3 pieces before and 1 piece after. This may be more than 5 lines total but the advantage is that tracebacks don't truncate statements or other groups of lines that should logically be together. However if you have a very long piece then it may be truncated in the middle to avoid arbitrarily long tracebacks.
`executing` library. The simplest way is to add a line such as `While calling: foo()` to each frame. This is probably the only decent option when there's no color. With color, there's the possibility of highlighting the expression similar to what heartrate does. However given that the expression may contain multiple variables with random colors this will probably be hard to do in a way that's reliably readable.
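To show what measuring context in pieces rather than lines means in practice, here is a small sketch. It is not stack_data's API: the function `pieces_in_context` is hypothetical and pieces are modelled as plain `range` objects of line numbers. Only the defaults of 3 pieces before and 1 after come from the description above.

```python
def pieces_in_context(pieces, executing_index, before=3, after=1):
    """Select the executing piece plus a fixed number of neighbouring pieces."""
    start = max(0, executing_index - before)
    stop = min(len(pieces), executing_index + after + 1)
    return pieces[start:stop]


# Each range covers the line numbers of one piece (a statement or a header).
pieces = [
    range(1, 2),    # 1 line
    range(2, 5),    # 3 lines, e.g. a multiline statement
    range(5, 6),    # 1 line
    range(6, 9),    # 3 lines
    range(9, 10),   # 1 line: the executing piece
    range(10, 12),  # 2 lines
    range(12, 13),  # 1 line
]
shown = pieces_in_context(pieces, executing_index=4)
total_lines = sum(len(piece) for piece in shown)
```

Here five pieces are selected but they span 10 source lines, which is the trade-off mentioned above: statements are never cut in half, at the cost of a less predictable line count.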