eliben / pycparser

:snake: Complete C99 parser in pure Python

Parsing strings that contain escaped double quotes #553

Closed sebastien-rosset closed 2 months ago

sebastien-rosset commented 2 months ago

It looks like there is a problem parsing strings that contain escaped double quotes, as shown in the test below.

#!/usr/bin/env python3

import unittest
from pycparser import c_parser, c_ast

class TestPycparserUnescapedQuotes(unittest.TestCase):
    def setUp(self):
        self.parser = c_parser.CParser()

    def test_string_parsing(self):
        test_cases = [
            ('char* s = "hello";', '"hello"', "Simple string"),
            ('char* s = "hello \"\"";', '"hello \"\""', "String with escaped quotes"),
            ('char* s = "hello \"world\"";', '"hello \"world\""', "String with escaped quotes"),
            ('char* s = "";', '""', "Empty string"),
        ]

        for input_string, expected_value, description in test_cases:
            with self.subTest(input=input_string):
                ast = self.parser.parse(input_string)
                init = ast.ext[0].init
                self.assertEqual(
                    init.value,
                    expected_value,
                    f"{description}: String value should be correctly parsed"
                )

if __name__ == "__main__":
    unittest.main()

$ pip show pycparser
Name: pycparser
Version: 2.22
Summary: C parser in Python
Home-page: https://github.com/eliben/pycparser
Author: Eli Bendersky
Author-email: eliben@gmail.com
License: BSD-3-Clause
Location: /Users/serosset/git/opencpn-ux-experiments/.venv/lib/python3.8/site-packages
Requires: 
Required-by: cffi

eliben commented 2 months ago

Just from a quick look: are you sure your test escapes these strings properly?

Don't forget that Python does its own escaping when it sees \", unless you're in a raw string like r'''...
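
A minimal sketch of that difference, using only plain Python string literals (the variable names `plain` and `raw` are illustrative and do not come from the test above). It shows what text actually reaches the parser in each case:

```python
# Non-raw literal: Python itself consumes each \" and stores a bare ",
# so no backslash is left in the text handed to the C parser.
plain = 'char* s = "hello \"world\"";'

# Raw literal: backslashes are kept verbatim, so the text matches
# what a C source file on disk would contain.
raw = r'char* s = "hello \"world\"";'

print(plain)  # char* s = "hello "world"";
print(raw)    # char* s = "hello \"world\"";

# The two strings differ by exactly the two backslashes.
print(len(raw) - len(plain))  # 2
```

With the non-raw form, the C code the parser sees no longer contains escaped quotes at all, which would explain a mismatch in a test written that way.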

Removing this from the picture, consider:

$ cat /tmp/2.c
void foo() {
  char* s = "hello \"world\"";
}

$ python3 examples/dump_ast.py /tmp/2.c
FileAST: 
  FuncDef: 
    Decl: foo, [], [], [], []
      FuncDecl: 
        TypeDecl: foo, [], None
          IdentifierType: ['void']
    Compound: 
      Decl: s, [], [], [], []
        PtrDecl: []
          TypeDecl: s, [], None
            IdentifierType: ['char']
        Constant: string, "hello \"world\""

Looks alright, I think?

sebastien-rosset commented 2 months ago

You are right, it looks like there was an issue with escaping.