Shopify / pyoozie

Library for querying and scheduling with Apache Oozie
https://py-oozie.readthedocs.io
MIT License

Read unicode coordinator configurations #31

Closed cfournie closed 7 years ago

cfournie commented 7 years ago

This PR fixes an issue where a coordinator configuration containing a non-ASCII character fails to parse. In Python 2, parsing raises the following exception:

self = <xml.sax.expatreader.ExpatParser instance at 0x10bec5368>, data = '
<configuration>
    <property>
        <name>key1</name>
        <value>value1</value>
    </property>
    <property>
        <name>key2</name>
        <value>😢</value>
    </property>
</configuration>
', isFinal = 0

    def feed(self, data, isFinal = 0):
        if not self._parsing:
            self.reset()
            self._parsing = 1
            self._cont_handler.startDocument()

        try:
            # The isFinal parameter is internal to the expat reader.
            # If it is set to true, expat will check validity of the entire
            # document. When feeding chunks, they are not normally final -
            # except when invoked from close.
>           self._parser.Parse(data, isFinal)
E           UnicodeEncodeError: 'ascii' codec can't encode characters in position 160-161: ordinal not in range(128)

/usr/local/opt/pyenv/versions/2.7.13/lib/python2.7/xml/sax/expatreader.py:213: UnicodeEncodeError

Fixes https://github.com/Shopify/pyoozie/issues/32
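
For illustration, here is a minimal sketch of one common way to avoid this class of error: encode the text to UTF-8 bytes before handing it to the SAX parser, so that expat never falls back to the implicit ASCII codec. This is not necessarily the change made in this PR; `PropertyHandler` and `parse_configuration` are hypothetical names introduced for the example and are not part of pyoozie.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PropertyHandler(ContentHandler):
    """Hypothetical handler: collects <name>/<value> pairs from <property> elements."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.properties = {}
        self._tag = None
        self._name = []
        self._value = []

    def startElement(self, tag, attrs):
        self._tag = tag
        if tag == 'property':
            # Start a fresh name/value pair for each property.
            self._name, self._value = [], []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._tag == 'name':
            self._name.append(content)
        elif self._tag == 'value':
            self._value.append(content)

    def endElement(self, tag):
        if tag == 'property':
            key = ''.join(self._name).strip()
            self.properties[key] = ''.join(self._value).strip()
        self._tag = None


def parse_configuration(text):
    """Hypothetical helper: parse a <configuration> document into a dict."""
    # Assumption: the input is either text or UTF-8 encoded bytes.
    # Encoding text explicitly avoids the implicit ASCII encode in Python 2.
    if not isinstance(text, bytes):
        text = text.encode('utf-8')
    handler = PropertyHandler()
    xml.sax.parseString(text, handler)
    return handler.properties


config = (
    u'<configuration>'
    u'<property><name>key1</name><value>value1</value></property>'
    u'<property><name>key2</name><value>\U0001F622</value></property>'
    u'</configuration>'
)
properties = parse_configuration(config)
```

With the configuration from the traceback above, `properties` maps `key1` to `value1` and `key2` to the emoji, without raising `UnicodeEncodeError`.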