UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 35: ordinal not in range(128)

fmariluis commented 8 years ago

Hi,

When parsing a test docstring that contains a non-ASCII character (in a UTF-8 encoded source file), Green 2.2.0 on Python 2.7 fails with a UnicodeDecodeError.

For example:

# -*- coding: utf-8 -*-

import unittest

class TestGrammar(unittest.TestCase):

    def test_one(self):
        """
        This works
        """
        one = 1
        two = 1
        self.assertEqual(one, two)

if __name__ == '__main__':
    unittest.main()

This works fine.

But the same test with a non-ASCII character in the docstring:

# -*- coding: utf-8 -*-

import unittest

class TestGrammar(unittest.TestCase):

    def test_one(self):
        """
        Ésto va a fallar.
        """
        one = 1
        two = 1
        self.assertEqual(one, two)

if __name__ == '__main__':
    unittest.main()

Fails with:

Traceback (most recent call last):
  File "/home/franco/.virtualenvs/verde/bin/green", line 9, in <module>
    load_entry_point('green==2.2.0', 'console_scripts', 'green')()
  File "/home/franco/.virtualenvs/verde/local/lib/python2.7/site-packages/green/cmdline.py", line 75, in main
    result = run(test_suite, stream, args, testing)
  File "/home/franco/.virtualenvs/verde/local/lib/python2.7/site-packages/green/runner.py", line 92, in run
    targets = [(target, manager.Queue()) for target in toParallelTargets(suite, args.targets)]
  File "/home/franco/.virtualenvs/verde/local/lib/python2.7/site-packages/green/loader.py", line 60, in toParallelTargets
    proto_test_list = toProtoTestList(suite)
  File "/home/franco/.virtualenvs/verde/local/lib/python2.7/site-packages/green/loader.py", line 45, in toProtoTestList
    toProtoTestList(i, test_list, doing_completions)
  File "/home/franco/.virtualenvs/verde/local/lib/python2.7/site-packages/green/loader.py", line 45, in toProtoTestList
    toProtoTestList(i, test_list, doing_completions)
  File "/home/franco/.virtualenvs/verde/local/lib/python2.7/site-packages/green/loader.py", line 45, in toProtoTestList
    toProtoTestList(i, test_list, doing_completions)
  File "/home/franco/.virtualenvs/verde/local/lib/python2.7/site-packages/green/loader.py", line 45, in toProtoTestList
    toProtoTestList(i, test_list, doing_completions)
  File "/home/franco/.virtualenvs/verde/local/lib/python2.7/site-packages/green/loader.py", line 42, in toProtoTestList
    test_list.append(proto_test(suite))
  File "/home/franco/.virtualenvs/verde/local/lib/python2.7/site-packages/green/result.py", line 27, in proto_test
    return ProtoTest(test)
  File "/home/franco/.virtualenvs/verde/local/lib/python2.7/site-packages/green/result.py", line 57, in __init__
    for line in test._testMethodDoc.lstrip().split('\n'):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
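The last line of the traceback can be reproduced in isolation. The sketch below assumes the cause is Python 2 implicitly decoding the docstring's UTF-8 bytes with the ASCII codec when they are combined with a unicode string (for instance, if the '\n' separator in result.py is a unicode literal, which is an assumption here). It performs that decode explicitly, so it behaves the same on Python 3.

```python
# -*- coding: utf-8 -*-
# UTF-8 bytes of the failing docstring text; "É" is encoded as 0xc3 0x89.
raw = u"Ésto va a fallar.".encode("utf-8")

message = None
try:
    # Explicit version of the implicit bytes-to-unicode coercion in Python 2.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
```

The byte 0xc3 named in the message is the first byte of the UTF-8 encoding of "É", which is outside the 0-127 range the ASCII codec accepts.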

A possible workaround is to decode test._testMethodDoc as UTF-8 in ProtoTest.__init__ (green/result.py):

            if getattr(test, "_testMethodDoc", None):
                for line in test._testMethodDoc.decode('utf8').lstrip().split('\n'):
                    line = line.strip()
                    if not line:
                        break
                    doc_segments.append(line)
            self.docstr_part = ' '.join(doc_segments)

Now it works:

.

Ran 1 test in 0.108s

OK (passes=1)

But I don't know whether this solution is acceptable in general.
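One reason for caution is that calling .decode('utf8') unconditionally only suits Python 2 byte strings: a Python 3 str has no decode method, and a docstring that is already unicode does not need decoding. A minimal sketch of a more tolerant variant follows. The helper names docstring_to_text and first_paragraph are hypothetical and not part of Green, and the sketch assumes the source file is UTF-8 encoded, which will not hold for every project.

```python
# -*- coding: utf-8 -*-


def docstring_to_text(doc, encoding="utf-8"):
    """Return doc as text, decoding only when it is a byte string."""
    # `bytes` is an alias for `str` on Python 2 and a distinct type on Python 3.
    if isinstance(doc, bytes):
        return doc.decode(encoding)
    return doc  # already text: Python 3 `str` or Python 2 `unicode`


def first_paragraph(doc):
    """Join the leading non-blank lines of a docstring with spaces."""
    segments = []
    for line in docstring_to_text(doc).lstrip().split(u"\n"):
        line = line.strip()
        if not line:
            break  # stop at the first blank line, as the loop above does
        segments.append(line)
    return u" ".join(segments)


# Works for both a UTF-8 byte string and a text string.
as_bytes = u"\n        Ésto va a fallar.\n        ".encode("utf-8")
as_text = u"\n        Ésto va a fallar.\n        "
print(first_paragraph(as_bytes) == first_paragraph(as_text))
```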

CleanCut commented 8 years ago

I looked into this. I couldn't find a fix for green that wouldn't break something under some other condition or in some other version of Python.

BUT I did find a good workaround for you: start your docstring with u""". That makes the docstring a unicode string on Python 2, so it is parsed correctly, and the prefix is also accepted on Python 3.3 and later (Python 3.0 through 3.2 reject it).

# -*- coding: utf-8 -*-

import unittest

class TestGrammar(unittest.TestCase):

    def test_one(self):
        u"""
        Ésto va a fallar.
        """
        one = 1
        two = 1
        self.assertEqual(one, two)

if __name__ == '__main__':
    unittest.main()

jayvdb commented 8 years ago

The other, arguably more 'correct', solution is to add from __future__ import unicode_literals at the top of the test module.
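As a sketch of that suggestion applied to the failing test from this issue: the future import turns every unprefixed string literal in the module, docstrings included, into a unicode string on Python 2, and it has no effect on Python 3. The `__main__` guard from the original example is left out here for brevity; the file can still be run with green or python -m unittest.

```python
# -*- coding: utf-8 -*-
# A __future__ import may only be preceded by comments, blank lines,
# the module docstring, and other __future__ imports.
from __future__ import unicode_literals

import unittest


class TestGrammar(unittest.TestCase):

    def test_one(self):
        """
        Ésto va a fallar.
        """
        one = 1
        two = 1
        self.assertEqual(one, two)
```

Compared with the u""" prefix, this covers every docstring in the module at once, at the cost of also changing all other unprefixed literals in that file to unicode on Python 2.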

CleanCut commented 8 years ago

@jayvdb That's a much nicer workaround. Thanks!

CleanCut / green #102