dawnbeen / c_formatter_42

C language formatter for 42 norminette
GNU General Public License v3.0
176 stars 17 forks source link

Fix #58 #64

Closed younesaassila closed 1 year ago

younesaassila commented 1 year ago

Fix #58 thanks to ChatGPT

Prompt You are a Python expert. Your task is to detect the catastrophic backtracking error in a regex for a C formatter program. If you cannot find the catastrophic backtracking error, it's okay to say so. Here are our regex helpers: ``` # regex for a type REGEX_TYPE = r"(?!return|goto)([a-z]+\s+)*[a-zA-Z_]\w*" # regex for a c variable/function name REGEX_NAME = r"\**[a-zA-Z_*()]\w*" # regex for a name in a declaration context (with array and function ptr) REGEX_DECL_NAME = r"\(?{name}(\[.*\])*(\)\(.*\))?".format(name=REGEX_NAME) ``` And here is the regex with the catastrophic backtracking error: ``` r"^(static\s+)?{type}\s+{name}\((.|\s)*?\);" ```

The regex you've provided indeed contains a potential catastrophic backtracking issue. The problematic part is the (.\|\s)*? section within the regex. This part can lead to catastrophic backtracking when trying to match certain inputs, especially if the input contains a long sequence of characters followed by a semicolon.

Here's why it's problematic:

To avoid this issue, you can use a more specific pattern to match the content within the parentheses in a safer way. Here's a modified version of the regex that should help avoid catastrophic backtracking:

r"^(static\s+)?{type}\s+{name}\([^)]*?\);"

In this modified regex, [^)]*? is used to match any sequence of characters except a closing parenthesis. This pattern ensures that the regex engine won't get stuck in catastrophic backtracking scenarios.

younesaassila commented 1 year ago

Do note that this only fixes the hanging. c_formatter_42 will still insert void in the comments when it sees something like bzero() (in the issue author input)

cacharle commented 1 year ago

Can you add a test with pytest-timeout?

@pytest.mark.timeout(60)
def test_foo():
    pass
younesaassila commented 1 year ago

@cacharle For the whole run_all procedure or just for line_break?

cacharle commented 1 year ago

Just for the example we know was infinite before your fix for now (so only line_break)

But I see you can set a global timeout in the setup.cfg file like so:

[pytest]
timeout = 20

https://pypi.org/project/pytest-timeout/

younesaassila commented 1 year ago

@cacharle Been trying to make it work for the last 30 minutes but it seems that pytest-timeout suffers from an issue with (at least) Python 3.10: https://github.com/pytest-dev/pytest-timeout/issues/113 The timeout function just doesn't work as it is supposed to

cacharle commented 1 year ago

Works fine for me with python3.10

c_formatter_42 ❯ pytest tests/test_run.py -k foo
========================================================== test session starts ==========================================================
platform darwin -- Python 3.10.8, pytest-7.4.0, pluggy-1.3.0
rootdir: ~/git/c_formatter_42
configfile: setup.cfg
plugins: timeout-2.1.0, cov-4.1.0, clarity-1.0.1
collected 5 items / 4 deselected / 1 selected

tests/test_run.py F                                                                                                               [100%]

=============================================================== FAILURES ================================================================
_______________________________________________________________ test_foo ________________________________________________________________

    @pytest.mark.timeout(2)
    def test_foo():
>       time.sleep(10000)
E       Failed: Timeout >2.0s

tests/test_run.py:23: Failed

---------- coverage: platform darwin, python 3.10.8-final-0 ----------
Name                                                  Stmts   Miss  Cover   Missing
-----------------------------------------------------------------------------------
c_formatter_42/__init__.py                                0      0   100%
c_formatter_42/__main__.py                               26     26     0%   15-63
c_formatter_42/data/__init__.py                           0      0   100%
c_formatter_42/formatters/__init__.py                     0      0   100%
c_formatter_42/formatters/align.py                       51     41    20%   32-96, 102, 107-109
c_formatter_42/formatters/clang_format.py                28     18    36%   31-45, 65-75
c_formatter_42/formatters/helper.py                      14      7    50%   27-34
c_formatter_42/formatters/hoist.py                       20     16    20%   45-80
c_formatter_42/formatters/line_breaker.py                49     42    14%   7-9, 13-36, 59-72, 76-77, 82-90
c_formatter_42/formatters/misc.py                         9      4    56%   17, 26, 34, 42
c_formatter_42/formatters/preprocessor_directive.py      23     21     9%   17-45
c_formatter_42/formatters/return_type_single_tab.py      12      9    25%   20-35
c_formatter_42/run.py                                    19     11    42%   29-39
-----------------------------------------------------------------------------------
TOTAL                                                   251    195    22%

======================================================== short test summary info ========================================================
FAILED tests/test_run.py::test_foo - Failed: Timeout >2.0s
==================================================== 1 failed, 4 deselected in 2.25s ====================================================
$ python --version
Python 3.10.8

$ pip list | grep time
pytest-timeout        2.1.0

with the following code:

import time

@pytest.mark.timeout(2)
def test_foo():
    time.sleep(10000)
younesaassila commented 1 year ago

@cacharle Alright well perhaps it's a Windows problem then. I'll push the test

cacharle commented 1 year ago

WSL 👀👀👀👀👀

younesaassila commented 1 year ago

WSL 👀👀👀👀👀

Yeah I use WSL for 42 projects, I'm just too lazy to setup all my Python env on there lol