Open 837960b1-4673-46a7-b405-5249f83f70cf opened 4 years ago
answers_field_order=sorted(
set(j for i in data['items'] for j in i),
key=cmp_to_key(lambda x,y:(
-1 if (x,y) in answer_order
else (0 if x==y else 1)))
)
when the cursor is placed in line 5 col 31 (between )
and ))
) hitting Ctrl-0 (Show surrounding parens) produces an error sound, though there's no error, the script works fine.
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.4 LTS
Release: 18.04
Codename: bionic
$ uname -a
Linux odd-one 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ python3.8
Python 3.8.0 (default, Oct 28 2019, 16:14:01)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import importlib
>>> importlib.util.find_spec('idlelib.pyshell')
ModuleSpec(name='idlelib.pyshell', loader=<_frozen_importlib_external.SourceFileLoader object at 0x7f1ea8ee14c0>, origin='/usr/lib/python3.8/idlelib/pyshell.py')
>>> exit()
$ dpkg-query -S /usr/lib/python3.8/idlelib/pyshell.py
idle-python3.8: /usr/lib/python3.8/idlelib/pyshell.py
$ dpkg -l | grep idle-python3\.8
ii idle-python3.8 3.8.0-3~18.04 all IDE for Python (v3.8) using Tkinter
A minimal example of the same issue seems to be:
(1 if x
else 0)
In this case the parentheses matching also fails. It only ever seems to fail when else
is the first word on a newline inside brackets, and adding the line break elsewhere in the above statement will not cause the matching to fail.
I imagine it stops looking for the matching parentheses when it gets to the else
statement, as an else
when part of an if/else
cannot be inside parentheses. Obviously the exception to this is the ternary operator.
Bizarrely, it only seems to fail the first time it encounters this issue. Any subsequent code with the same pattern seems to be matched fine. So in the following example, the second brackets get matched without an issue:
(1 if x
else 0)
(1 if x
else 0)
Hopefully someone more familiar with the source code can shed some light on this!
(While I wrote the following, Lewis Ball posted some of the same conclusions.)
I verified the failure with 3.8.5 and current master.
I reduced the failing example to a minimum of 7 chars on 2 lines. ( else)
1a. Deleting '\n' or any letter of 'else' fixes the problem. 1b. Adding 'else' before '(' fixes the problem. 1a is true for the initial 'else' on line 4 of the original example. 1b is true for 'else' on a new line before the first.
2c. Replacing 'else' with 'elif', 'while', 'def', or 'class' makes no difference. This include the addition for 1b above. Replacing 'else' with 'if', 'for', or 'with' fixes the bug.
Bingo!!! Matching uses idlelib.hyperparser, which uses idlelib.pyparse, which has regex _synchre, which finds "what looks like the start of a popular statement". It matches 'elif', etcetera, but not 'if', etcetera. I noticed previously that these 'popular' line beginners were missing and have planned to add them.* But we need to make matched words not disable fence matching.
2d. In the original and minimal examples, moving the offending 'else' to the end of the previous line fixes the problem. So I believe that 'else' at the beginning of the line is seen marking the beginning of the statememt, and matching does not look back further. If 'else' and 'yield' were left in _synchre and 'if' and 'for' were added back, the matcher would have to determine whether the line is a continuation or not. (And I believe that not having 'if' and 'for' degrades the use of pyparse at least for other uses.)
2e. The minimal failing example works in the shell. Why are _synchre keywords not poisonous here? Is it just because the current statement is known to begin just after the '>>> ' prompt, so that else beginning the second line is somehow not seen as beginning a new statement? We need to check if and how fence matching in shell is different other than the presence of the marker where code input begins.
Alexey: in (https://github.com/python/cpython/issues/65955#issuecomment-1093657458) I claimed that there is no issue with typing closers or ^0 on the same line just before the closer. Thank you for reporting this counter-example. I imagine that you are not the first IDLE user to write a conditional expression on 2 or more lines with else at the beginning of a line. Such expressions always need a closer.
Without rereading hyperparser and pyparse, I don't understand why a previous _synchre match prevents subsequent failures. It does suggest that when matching parentheses, a fix would be to temporarily prepend 'else\n' to the text, or otherwise simulating the effect of that.
But the previous 'else' might only work before the 'current' statement. And we should determine whether any other uses of these modules have related bugs. Also, removing 'else' from the _synchre regex did not fix the issue, so there is something else I am missing.
Lewis, removing you was an accident. Your post is correct.
So it looks like pyparse.Parser.find_good_parse_start
is responsible for truncating the code to only look for the matching bracket in the current statement, which uses _synchre.
Testing it out, it sometimes will go to the most recent match of _synchre, but will sometimes go back to an even earlier match in the code, which is why something like
else
(
else)
manages to match without an issue. The find_good_parse_start
will truncate at the first else
in this case, instead at the second one.
Removing else
from the _synchre regex did solve this problem for me though, as find_good_parse_start
will then try to truncate even earlier when looking for the start of the statement, which will be before the opener. Although, I haven't checked what else would be affected by this change. I am not sure why this worked for me and did not work for you Terry.
Also, even with this fix it seems like 'find_good_parse_start` goes back further than it needs to. Fixing that may not change much, but would offer a slight performance increase.
Okay, on further examination find_good_parse_start
first looks for a line ending with ':\n', and then looks to match _synchre on that line. If it can't find any line ending in ':\n' then it will just look for the first occurrence of something in _synchre to truncate on. If it can't find anything at that point, it just doesn't truncate and the whole code is checked for a bracket match.
So the reason that
else
(
else)
works is because there is no `:` after the first else, and adding `:` will cause the matching to fail.
This seems reasonable, and I was probably too quick to say that the function goes back further than it needs to when looking to truncate.
It seems like then that else
being removed from _synchre will fix this, and mean that the only time brackets aren't matched are when invalid code is typed, something like:
(
def)
I can put in a PR for this tomorrow (probably removing yield
at the same time) if this sounds like the right fix.
P.S. Here is an example of the similar yield
error that Terry alluded to:
def f():
for i in range(10):
(
yield x)
As I at least hinted above, I would rather add the missing line starts than delete more. I am quite sure that their absence degrades overall performance in some sense. A much bigger match problem that this one is that ^0 always half fails when the cursor is not on the line with the closer. See bpo-21756, which already has a patch, though not one I was completely happy with last time I looked.
I cannot help but think that a proper solution might fix both issues.
I once wrote a fence matcher in C, initially for C, based on a deterministic finite state machine but with a push-down stack so it could match indefinite nesting. I am thinking that I should try to rewrite in python and see if it will solve both issues. Find good parse start is not really the right tool, because what we want to do with ^0 is to move both back and forward, find the first unmatched fence in each direction, and flash them, and add a beep if unmatched. BOF and EOF (begin/end of file) are the backup fences.
Another possible solution might be to use a depleted _synchre for matching and an augmented one for any other use where that is better.
Okay that makes sense. Removing things from _synchre would slow down the matching slightly, although the matching still seems plenty fast enough when done inside an if clause at the moment, so I'm not sure how noticeable the removal of `else` would be. One thing could be to search for `else:` instead of just `else` in _synchre, as only the former indicates the start of a new statement (although I guess it would actually have to check for `else[\s\\]*:` or something).
Like you say though, if there are other more pressing issues with the matching then maybe it is worth fixing them all at the same time. Happy to help with that if needed.
I reproduced the issue with 3.13.0rc2 on macOS and main on Windows. (\nelse)
(with auto indent added) must be the first pair typed since opening the window. Previous text loaded on opening a existing file does not count.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = 'https://github.com/terryjreedy' closed_at = None created_at =
labels = ['expert-IDLE', 'type-bug', '3.8', '3.9', '3.10']
title = 'IDLE fails to detect corresponding opening parenthesis'
updated_at =
user = 'https://bugs.python.org/AlexeyBurdin'
```
bugs.python.org fields:
```python
activity =
actor = 'Lewis Ball'
assignee = 'terry.reedy'
closed = False
closed_date = None
closer = None
components = ['IDLE']
creation =
creator = 'Alexey Burdin'
dependencies = []
files = []
hgrepos = []
issue_num = 41388
keywords = []
message_count = 9.0
messages = ['374205', '374218', '374224', '374225', '374226', '374274', '374298', '374301', '374350']
nosy_count = 4.0
nosy_names = ['terry.reedy', 'taleinat', 'Alexey Burdin', 'Lewis Ball']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'test needed'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue41388'
versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']
```