python / cpython

The Python programming language
https://www.python.org

IDLE needs syntax highlighting for f-strings #73473

Open rhettinger opened 7 years ago

rhettinger commented 7 years ago
BPO 29287
Nosy @rhettinger, @terryjreedy

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

```python
assignee = 'https://github.com/terryjreedy'
closed_at = None
created_at =
labels = ['expert-IDLE', 'type-feature', '3.7']
title = 'IDLE needs syntax highlighting for f-strings'
updated_at =
user = 'https://github.com/rhettinger'
```

bugs.python.org fields:

```python
activity =
actor = 'terry.reedy'
assignee = 'terry.reedy'
closed = False
closed_date = None
closer = None
components = ['IDLE']
creation =
creator = 'rhettinger'
dependencies = []
files = []
hgrepos = []
issue_num = 29287
keywords = []
message_count = 2.0
messages = ['285585', '286547']
nosy_count = 3.0
nosy_names = ['rhettinger', 'terry.reedy', 'peter.otten']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'test needed'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue29287'
versions = ['Python 3.7']
```

rhettinger commented 7 years ago

Follow the lead from Vim, MagicPython, and PyCharm. Needs separate colorization to make the expression distinct from the rest of the string. Needs close-brace matching. Would be desirable to have autocompletion as well.

https://twitter.com/raymondh/status/819085176764366848

terryjreedy commented 7 years ago

These are three related but somewhat distinct proposals.

  1. Special handling (normal syntax colorizing) of f-expressions (the grammatical term used at https://docs.python.org/3/reference/lexical_analysis.html#formatted-string-literals).

  2. Brace matching within the strings. (For normal strings, this would be nice when the string is to be used as a format string, but there is no way to know how the string will be used.)

  3. Normal identifier autocompletion (as opposed to within-string file name autocompletion).

I have two types of doubts about #1.

  1. Should it be done?
     a. I have not yet used f-strings, so I do not know if I would want uncolorized holes within them.
     b. Do beginners use f-strings? Would they find this more useful than confusing?
     c. Is this the sort of 'advanced feature' that Guido has said should *not* be copied from other editors?

  2. Can it be done sensibly within the limits of IDLE's colorizer?

IDLE's colorizer defines a giant regex that joins regexes that match keywords, builtins, comments, and strings (and newlines for synchronization). Each of the latter is a named capturing group that joins alternatives for that group. Keywords and built-in names are recognized when complete. Partial comments and strings are recognized as soon as '#' or an open quote is typed. There is a human-verified test of colorizing that could, I believe, be turned into a unit test of the re matching. This would be needed for approach b. below.
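To make the description above concrete, here is a much-simplified sketch of such a combined regex. It is not the code in idlelib; the names `any_group`, `make_pattern`, and `classify` are made up for illustration, and only single-line strings are handled. It does show the two behaviours mentioned: keywords and builtins match only as whole words, while a string matches as soon as the open quote is present because the closing quote is optional.

```python
import builtins
import keyword
import re


def any_group(name, alternates):
    """Return a named capturing group that matches any of the alternates."""
    return "(?P<%s>%s)" % (name, "|".join(alternates))


def make_pattern():
    """Build one regex that joins the per-category regexes (simplified)."""
    # Keywords and builtins only match as complete words (\b on both sides).
    kw = r"\b" + any_group("KEYWORD", keyword.kwlist) + r"\b"
    builtin_names = [name for name in dir(builtins)
                     if not name.startswith("_")
                     and name not in keyword.kwlist]
    builtin = r"\b" + any_group("BUILTIN", builtin_names) + r"\b"
    comment = any_group("COMMENT", [r"#[^\n]*"])
    # Assumption: single-line strings only. The trailing quote is optional,
    # so a partial string is recognized once the open quote is typed.
    prefix = r"(?i:r|u|f|fr|rf|b|br|rb)?"
    sqstring = prefix + r"'[^'\\\n]*(\\.[^'\\\n]*)*'?"
    dqstring = prefix + r'"[^"\\\n]*(\\.[^"\\\n]*)*"?'
    string = any_group("STRING", [sqstring, dqstring])
    sync = any_group("SYNC", [r"\n"])
    return re.compile("|".join([kw, builtin, comment, string, sync]))


def classify(source):
    """Return (group name, matched text) pairs in source order."""
    prog = make_pattern()
    found = []
    for match in prog.finditer(source):
        for name, value in match.groupdict().items():
            if value:
                found.append((name, value))
    return found


if __name__ == "__main__":
    for item in classify('for x in f"a{b}": # hi\n'):
        print(item)
```

With a pattern of this shape, the whole of `f"a{b}"` falls into the STRING group, which is why the embedded expression currently gets string colour.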

The compiled re is used in a ColorDelegator instance that is part of a chain of delegators tied to a text widget. The class code is not documented and I do not understand it well enough to modify it without adding tests. But it was not designed for easy testing.

Sidenote: There are DEBUG prints in multiple methods (but not in recolorize_main). Some messages can come from multiple methods. I should add a message to the r...main method and prefix all messages with an indicator of the source so the control flow is easier to follow.

I see two possible approaches to separately colorizing f-expressions within an f-string.

a. Follow the example of 'def' and 'class'. They are recognized as a special case (of keyword), and when they occur, a separate 'if' clause and re are used to colorize the following name.

The problem with doing this with f-strings is that we want to recursively apply the re...main function to a short substring, and the function is not designed for that. We also want to do this separately for each embedded f-expression. It might work to write a reduced version of recolorize_main as recolorize_fexp.

This approach would allow for {} matching once a closing quote is typed, but not identifier autocompletion.
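As a rough illustration of the first step approach a. would need, the sketch below locates the f-expression spans inside an already-matched f-string, so that a reduced recolorizer could be applied to each span. The helper name `fexpression_spans` is hypothetical and nothing like it exists in idlelib. It assumes a single literal, treats doubled braces as literal text, and does not try to separate conversions or format specs (such as `!r` or `:>10`) from the expression.

```python
def fexpression_spans(literal):
    """Return (start, end) index pairs for text between top-level braces.

    Unclosed braces produce no span, so nothing is reported until the
    closing brace has been typed.
    """
    spans = []
    depth = 0
    start = None
    i = 0
    while i < len(literal):
        # Outside any field, {{ and }} are escaped literal braces.
        if depth == 0 and literal[i:i + 2] in ("{{", "}}"):
            i += 2
            continue
        ch = literal[i]
        if ch == "{":
            if depth == 0:
                start = i + 1  # expression begins after the brace
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                spans.append((start, i))
        i += 1
    return spans


if __name__ == "__main__":
    text = 'f"You have {n} thing{s}"'
    for begin, end in fexpression_spans(text):
        print(begin, end, text[begin:end])
```

The same scan gives the positions of each matching brace pair, which is the information {} matching would need.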

b. Do the special-casing by writing special regexes to recognize a null f-string (no embedded f-expression), and beginning, middle, and ending string parts of an f-string. But I don't know if it is possible to write an re that will *only* match null f-strings.

That aside, the f-expression would then be treated normally, and autocomplete should just work. {} matching would be harder. Without adding new state variables, I imagine that the end quote of the invalid f"a{b" would be seen as the beginning of a new string.
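A small sketch of approach b., under heavy restrictions, may help show both the idea and the problem case. The patterns below are hypothetical and cover only single-line, double-quoted f-strings without backslashes; whether this extends to the general case is exactly the open question above. `NULL_FSTRING` matches a literal with no f-expression, `FSTRING_BEGIN` matches from the open quote to the first unescaped `{`, and `PLAIN_STRING` stands in for the ordinary string regex with an optional closing quote.

```python
import re

# Hypothetical patterns: single-line, double-quoted, no backslashes.
# Doubled braces ({{ and }}) count as literal text.
LITERAL = r'(?:[^"{}\\\n]|\{\{|\}\})*'
NULL_FSTRING = re.compile(r'[fF]"' + LITERAL + r'"')
# The lookahead stops the regex from splitting an escaped {{ pair.
FSTRING_BEGIN = re.compile(r'[fF]"' + LITERAL + r'\{(?!\{)')
PLAIN_STRING = re.compile(r'"[^"\\\n]*"?')


def split_begin(source):
    """Return (begin part, remainder), or None if no begin part matches."""
    match = FSTRING_BEGIN.match(source)
    if match is None:
        return None
    return match.group(), source[match.end():]


if __name__ == "__main__":
    begin, rest = split_begin('f"a{b"')
    print(repr(begin), repr(rest))
    # With no extra state, the leftover quote matches as a new string.
    print(repr(PLAIN_STRING.search(rest).group()))
```

For the invalid `f"a{b"`, the begin part consumes `f"a{`, and the remaining `"` is then picked up by the ordinary string pattern as the start of a new string, which is the behaviour anticipated above.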

Duoquadragesimal commented 2 years ago

I'm pretty much a beginner and I use f-strings. I do think that adding colourisation for them would help to distinguish them in my programs. For example:

numthings = int(input("How many things do you have? "))
print(f"You have {numthings} thing{int(not(numthings==1 or numthings==-1))*'s'}") 

It's a bit dense, but I like doing things in as few lines as possible (maybe not good practice? idk). In this example, IDLE shows everything in the curly braces in the f-string as green, which makes it a bit hard to see what's what when I'm running it. I don't know anything about how hard it would be to implement, but it would be a useful feature to me. There is, I think, the issue that it can look a bit like the string has ended (looking at other editors like vscode). Maybe to get around this, f-strings would change text colour but the background colour would stay as that of a string? Then if you set a different string background you can see that it's all part of a string, but you can also see functions being executed in the braces, which I would like.

terryjreedy commented 1 year ago

PEP 701, Syntactic formalization of f-strings, which will allow nested f-strings, will be in 3.12 or later. Internally, there will no longer be an 'f-string'. Instead, the C tokenizer will spit out multiple f-string pieces, with various labels. The tokenize module is supposed to be updated to match. If it is, f-strings identified by IDLE could be fed into tokenize.
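A minimal sketch of feeding an f-string to the standard-library `tokenize` module follows. The helper name `token_kinds` is made up for illustration. On an interpreter that implements PEP 701, the literal should come back as separate pieces (with token names such as FSTRING_START, FSTRING_MIDDLE, and FSTRING_END around ordinary tokens for the expression); on older versions the whole literal is a single STRING token. The code itself runs either way and simply reports what it gets.

```python
import io
import tokenize


def token_kinds(source):
    """Return (token name, token text) pairs for a snippet of source."""
    readline = io.StringIO(source).readline
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokenize.generate_tokens(readline)]


if __name__ == "__main__":
    for kind, text in token_kinds('f"a{b}c"\n'):
        print(kind, repr(text))
```

If the pieces are reported separately, the NAME and OP tokens between the string parts could be colorized with the normal rules.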