SRserves85 / avro-to-python

Light tool for compiling avro schema files (.avsc) to python classes
MIT License

Escape characters lost in AVRO schema #41

Closed: fmiguelez closed this issue 3 months ago

fmiguelez commented 3 months ago

When an AVRO schema contains an escape character, typically inside a "doc" attribute, the escape is lost in the schema string embedded in the generated class.

Consider the following schema:

{
    "type": "record",
    "name": "EscapedChars",
    "namespace": "records",
    "doc": "This is an example \"schema\".",
    "fields": [
        {
            "name": "name",
            "type": "string"
        }
    ]
}

The following test (in test_compiled_files.py) checks the schema embedded in the generated class:

    def test_escape_chars(self):
        """ tests that escape characters are not lost """

        from records import EscapedChars
        schema = r"""
    {
        "type": "record",
        "name": "EscapedChars",
        "namespace": "records",
        "doc": "This is an example \"schema\".",
        "fields": [
            {
                "name": "name",
                "type": "string"
            }
        ]
    }
    """

        self.assertEqual(
            schema,
            EscapedChars.schema,
            "Schema should contain escape characters"
        )

This test fails with the following error:

FAILED              [100%]
Schema should contain escape characters
('\n'
 '    {\n'
 '        "type": "record",\n'
 '        "name": "EscapedChars",\n'
 '        "namespace": "records",\n'
 '        "doc": "This is an example "schema".",\n'
 '        "fields": [\n'
 '            {\n'
 '                "name": "name",\n'
 '                "type": "string"\n'
 '            }\n'
 '        ]\n'
 '    }\n'
 '    ') != ('\n'
 '    {\n'
 '        "type": "record",\n'
 '        "name": "EscapedChars",\n'
 '        "namespace": "records",\n'
 '        "doc": "This is an example \\"schema\\".",\n'
 '        "fields": [\n'
 '            {\n'
 '                "name": "name",\n'
 '                "type": "string"\n'
 '            }\n'
 '        ]\n'
 '    }\n'
 '    ')

self = <test_compiled_files.PathTests testMethod=test_escape_chars>

    def test_escape_chars(self):
        """ tests that escape characters are not lost """

        from records import EscapedChars
        schema = r"""
    {
        "type": "record",
        "name": "EscapedChars",
        "namespace": "records",
        "doc": "This is an example \"schema\".",
        "fields": [
            {
                "name": "name",
                "type": "string"
            }
        ]
    }
    """

>       self.assertEqual(
            schema,
            EscapedChars.schema,
            "Schema should contain escape characters"
        )

test_compiled_files.py:55: AssertionError

The test fails because the escape character (backslash) is lost in the generated schema string.
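
One plausible cause, which this issue does not confirm, is that the generator pastes the schema text into an ordinary (non-raw) triple-quoted Python literal, so Python itself consumes each \" when the generated module is imported. The sketch below reproduces that effect in isolation. The names avsc_text, naive_source, safe_source and load_schema are hypothetical and are not part of avro-to-python; using repr() to emit the literal is only one possible way to keep the backslashes.

```python
import json

# Schema text as it appears in the .avsc file (a raw string keeps the backslashes).
avsc_text = r'{"type": "record", "name": "EscapedChars", "doc": "This is an example \"schema\"."}'

# Hypothetical generator step (an assumption, not the tool's actual code):
# paste the text into a normal triple-quoted literal.
naive_source = 'schema = """' + avsc_text + '"""'

# Possible alternative: let repr() build a literal that round-trips exactly.
safe_source = "schema = " + repr(avsc_text)


def load_schema(source: str) -> str:
    """Execute generated source and return the value bound to `schema`."""
    namespace = {}
    exec(source, namespace)
    return namespace["schema"]


naive_schema = load_schema(naive_source)
safe_schema = load_schema(safe_source)

# The naive literal loses the backslashes; the repr()-based literal keeps them.
print(naive_schema == avsc_text)
print(safe_schema == avsc_text)
```

If this is indeed the cause, the naive string is no longer valid JSON (the unescaped quotes end the "doc" value early), while the repr()-based string can still be parsed with json.loads.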