Textualize / rich

Rich is a Python library for rich text and beautiful formatting in the terminal.
https://rich.readthedocs.io/en/latest/
MIT License

[BUG] Number highlight regex does not support separators and binary forms. #3318

Open hardkrash opened 3 months ago

hardkrash commented 3 months ago

Describe the bug

The default number highlighting in Rich does not support digit separators or binary literals. Here is an example of printing numbers with rich.print:

rich.print(f"{0x1234:#08x} {0x123456:#08_x} {0b110011001100:#b} {0b111110100101:#_b} {1234567890:,d} {123456789:d}")

[image: screenshot of the highlighted output]

Here we see that the hex value is highlighted only up to the first underscore (_). The comma-separated decimal is highlighted only in the digit groups between the commas (,), and the underscore-separated decimal is not highlighted at all. The binary numbers are not highlighted either.

Perhaps the following regex could be used for numbers:

r"(?P<number>(?:0[xX][0-9a-fA-F_]+\b)|(?:0b[01_]+\b)|(?:(?<!\w)\-?(?:(?:(?:[0-9]{1,2}[,_])?(?:[0-9]{3}[,_])*(?<=[,_])[0-9]{3})|[0-9]+)(?:\.?[0-9]*(?:e[-+]?\d+)?)?\b))"

Possible enhancements would be to make the hex and binary alternatives accept only groups of 4 digits between separators, the way the decimal alternative accepts only groups of 3. Another enhancement would be to allow internationalization of the . and , in decimal numbers.
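
As a rough illustration of the "groups of 4" idea, here is a minimal sketch using only the standard `re` module. The names `STRICT_HEX`, `STRICT_BIN`, `is_strict_hex`, and `is_strict_bin` are hypothetical and not part of Rich. The patterns assume that a number is either a plain run of digits or is split by `_` into groups of exactly 4, with a shorter leading group allowed.

```python
import re

# Hypothetical stricter patterns (not part of Rich). Each accepts either:
#   - a leading group of 1-4 digits followed by one or more "_" + 4 digits, or
#   - a plain run of digits with no separators at all.
STRICT_HEX = re.compile(
    r"0[xX](?:[0-9a-fA-F]{1,4}(?:_[0-9a-fA-F]{4})+|[0-9a-fA-F]+)\b"
)
STRICT_BIN = re.compile(r"0b(?:[01]{1,4}(?:_[01]{4})+|[01]+)\b")


def is_strict_hex(text: str) -> bool:
    """Return True if the whole string is a hex literal with valid grouping."""
    return STRICT_HEX.fullmatch(text) is not None


def is_strict_bin(text: str) -> bool:
    """Return True if the whole string is a binary literal with valid grouping."""
    return STRICT_BIN.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("0x001234", "0x12_3456", "0x12_34"):
        print(sample, is_strict_hex(sample))
    for sample in ("0b110011001100", "0b1111_1010_0101", "0b11_01"):
        print(sample, is_strict_bin(sample))
```

Whether the stricter form is desirable is a judgment call: it rejects oddly grouped input such as `0x12_34`, which a highlighter might reasonably still want to color.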

One minor fix in the proposed regex is that it no longer has an extra unnamed capture group on the exponent of a floating-point number; see the re.findall output below.

I may well have missed a nuance in how the positive lookbehind assertion affects the match.
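
To show why the extra unnamed group matters, here is a small standalone sketch. It uses simplified patterns chosen only for this illustration, not the regexes from Rich. When a pattern contains more than one capturing group, `re.findall` returns a tuple per match; turning the exponent group into a non-capturing `(?:...)` group leaves a single group, so plain strings come back.

```python
import re

text = "1.5e+11 42"

# Exponent in a capturing group: two groups in total, so findall
# returns one (number, exponent) tuple per match.
with_group = re.findall(r"(?P<number>\d+\.?\d*(e[-+]?\d+)?)", text)

# Exponent in a non-capturing group: only the named group remains,
# so findall returns the matched numbers as plain strings.
without_group = re.findall(r"(?P<number>\d+\.?\d*(?:e[-+]?\d+)?)", text)

print(with_group)
print(without_group)
```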

[ins] In [30]: proposed_re = r"(?P<number>(?:0[xX][0-9a-fA-F_]+\b)|(?:0b[01_]+\b)|(?:(?<!\w)\-?(?:(?:(?:[0-9]{1,2}[,_])?(?:[0-9]{3}[,_])*(?<=[,_])[0-9]{3})|[0-9]+)(
          ...: ?:\.?[0-9]*(?:e[-+]?\d+)?)?\b))"
Out[30]: '(?P<number>(?:0[xX][0-9a-fA-F_]+\\b)|(?:0b[01_]+\\b)|(?:(?<!\\w)\\-?(?:(?:(?:[0-9]{1,2}[,_])?(?:[0-9]{3}[,_])*(?<=[,_])[0-9]{3})|[0-9]+)(?:\\.?[0-9]*(?:e[-+]?\\d+)?)?\\b))'

[ins] In [31]: current_re = r"(?P<number>(?<!\w)\-?[0-9]+\.?[0-9]*(e[-+]?\d+?)?\b|0x[0-9a-fA-F]*)"
Out[31]: '(?P<number>(?<!\\w)\\-?[0-9]+\\.?[0-9]*(e[-+]?\\d+?)?\\b|0x[0-9a-fA-F]*)'

[ins] In [32]: string = f"{0x1234:#08x} {0x123456:#08_x} {0b110011001100:#b} {0b111110100101:#_b} {1234567890:,d} {1234567890:_d} {123456789:d} {-123234.654321:f} {
          ...: -123234.654321:,f} {12345.345e7:g} 123.456  123 1 12 -123 -123.45"
Out[32]: '0x001234 0x12_3456 0b110011001100 0b1111_1010_0101 1,234,567,890 1_234_567_890 123456789 -123234.654321 -123,234.654321 1.23453e+11 123.456  123 1 12 -123 -123.45'

[ins] In [33]: re.findall(current_re, string)
Out[33]:
[('0x001234', ''),
 ('0x12', ''),
 ('1', ''),
 ('234', ''),
 ('567', ''),
 ('890', ''),
 ('123456789', ''),
 ('-123234.654321', ''),
 ('-123', ''),
 ('234.654321', ''),
 ('1.23453e+11', 'e+11'),
 ('123.456', ''),
 ('123', ''),
 ('1', ''),
 ('12', ''),
 ('-123', ''),
 ('-123.45', '')]

[ins] In [34]: re.findall(proposed_re, string)
Out[34]:
['0x001234',
 '0x12_3456',
 '0b110011001100',
 '0b1111_1010_0101',
 '1,234,567,890',
 '1_234_567_890',
 '123456789',
 '-123234.654321',
 '-123,234.654321',
 '1.23453e+11',
 '123.456',
 '123',
 '1',
 '12',
 '-123',
 '-123.45']
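
The session above could be turned into a small regression check along these lines. This is only a sketch: it copies the proposed regex and the sample string from the session, and assumes that every whitespace-separated token in the sample should be matched as one whole number. The names `PROPOSED_RE`, `SAMPLE`, and `find_numbers` are made up for the example.

```python
import re

# Proposed pattern, copied from the session above.
PROPOSED_RE = re.compile(
    r"(?P<number>(?:0[xX][0-9a-fA-F_]+\b)|(?:0b[01_]+\b)|"
    r"(?:(?<!\w)\-?(?:(?:(?:[0-9]{1,2}[,_])?(?:[0-9]{3}[,_])*(?<=[,_])[0-9]{3})|[0-9]+)"
    r"(?:\.?[0-9]*(?:e[-+]?\d+)?)?\b))"
)

# Sample string, copied from Out[32] above.
SAMPLE = (
    "0x001234 0x12_3456 0b110011001100 0b1111_1010_0101 "
    "1,234,567,890 1_234_567_890 123456789 -123234.654321 "
    "-123,234.654321 1.23453e+11 123.456  123 1 12 -123 -123.45"
)


def find_numbers(text: str) -> list[str]:
    """Return every substring matched by the named 'number' group."""
    return [m.group("number") for m in PROPOSED_RE.finditer(text)]


if __name__ == "__main__":
    expected = SAMPLE.split()  # assumption: each token is one whole number
    actual = find_numbers(SAMPLE)
    for token, match in zip(expected, actual):
        print(f"{token!r:>22} -> {match!r}")
    assert actual == expected
```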

Platform

What platform (Mac) are you running on? Mac OS 14

What terminal software are you using? iTerm2

I may ask you to copy and paste the output of the following commands. It may save some time if you do it now.

If you're using Rich in a terminal:

```
A high level console interface.

color_system = 'truecolor'
encoding = 'utf-8'
file = <_io.TextIOWrapper name='' mode='w' encoding='utf-8'>
height = 39
is_alt_screen = False
is_dumb_terminal = False
is_interactive = True
is_jupyter = False
is_terminal = True
legacy_windows = False
no_color = False
options = ConsoleOptions(
    size=ConsoleDimensions(width=164, height=39),
    legacy_windows=False,
    min_width=1,
    max_width=164,
    is_terminal=True,
    encoding='utf-8',
    max_height=39,
    justify=None,
    overflow=None,
    no_wrap=False,
    highlight=None,
    markup=None,
    height=None
)
quiet = False
record = False
safe_box = True
size = ConsoleDimensions(width=164, height=39)
soft_wrap = False
stderr = False
style = None
tab_size = 8
width = 164

Windows features available.

WindowsConsoleFeatures(vt=False, truecolor=False)

truecolor = False
vt = False

Environment Variables

{
    'TERM': 'xterm-256color',
    'COLORTERM': 'truecolor',
    'CLICOLOR': None,
    'NO_COLOR': None,
    'TERM_PROGRAM': 'iTerm.app',
    'COLUMNS': None,
    'LINES': None,
    'JUPYTER_COLUMNS': None,
    'JUPYTER_LINES': None,
    'JPY_PARENT_PID': None,
    'VSCODE_VERBOSE_LOGGING': None
}

platform="Darwin"
rich==13.7.0
```

github-actions[bot] commented 3 months ago

We found the following entry in the FAQ which you may find helpful:

Feel free to close this issue if you found an answer in the FAQ. Otherwise, please give us a little time to review.

This is an automated reply, generated by FAQtory