ImportErrors caused by a line containing unicode text cause UnicodeError when attempting to log trace

ayrtonmassey commented 9 years ago

I'm using green 2.0.1 on Python 2.7.6. I had a bug in a line of my code that contained unicode characters. When I attempted to run green, I got the following error:

Traceback (most recent call last):
  File "/home/ayrton/.venv/whosays/bin/green", line 11, in <module>
    sys.exit(main())
  File "/home/ayrton/.venv/whosays/local/lib/python2.7/site-packages/green/cmdline.py", line 66, in main
    test_suite = loadTargets(args.targets, file_pattern = args.file_pattern)
  File "/home/ayrton/.venv/whosays/local/lib/python2.7/site-packages/green/loader.py", line 298, in loadTargets
    suite = loadTarget(target, file_pattern)
  File "/home/ayrton/.venv/whosays/local/lib/python2.7/site-packages/green/loader.py", line 353, in loadTarget
    tests = discover(candidate, file_pattern=file_pattern)
  File "/home/ayrton/.venv/whosays/local/lib/python2.7/site-packages/green/loader.py", line 266, in discover
    subdir_suite = discover(path, file_pattern=file_pattern)
  File "/home/ayrton/.venv/whosays/local/lib/python2.7/site-packages/green/loader.py", line 266, in discover
    subdir_suite = discover(path, file_pattern=file_pattern)
  File "/home/ayrton/.venv/whosays/local/lib/python2.7/site-packages/green/loader.py", line 266, in discover
    subdir_suite = discover(path, file_pattern=file_pattern)
  File "/home/ayrton/.venv/whosays/local/lib/python2.7/site-packages/green/loader.py", line 278, in discover
    module_suite = loadFromModuleFilename(path)
  File "/home/ayrton/.venv/whosays/local/lib/python2.7/site-packages/green/loader.py", line 230, in loadFromModuleFilename
    dotted_module, filename, traceback.format_exc())
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 504: ordinal not in range(128)

The offending lines are 229 & 230 in green/loader.py:

        message = ('Failed to import {} computed from filename {}\n{}').format(
                       dotted_module, filename, traceback.format_exc())

It seems that traceback.format_exc() cannot process unicode characters. Removing traceback.format_exc() fixes the problem but obviously removes the traceback of the exception, which is not desirable.
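
One possible reading of the traceback, offered as an assumption rather than something confirmed in this thread: on Python 2, traceback.format_exc() returns a byte string, and if the format string on line 229 is a unicode string, Python 2 implicitly decodes that byte string as ASCII before interpolating it. The sketch below makes that decode explicit so it runs on both Python 2 and 3. The name trace_bytes and its contents are hypothetical stand-ins for the formatted traceback.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for what traceback.format_exc() returns on Python 2
# when the failing source line contains a UTF-8 encoded RIGHT DOUBLE
# QUOTATION MARK (U+201D, bytes e2 80 9d).
trace_bytes = b'NameError on line: PATTERN = re.compile(u"\xe2\x80\x9d")'


def decode_as_ascii(data):
    """Mimic the implicit ASCII decode Python 2 applies when mixing types."""
    try:
        return data.decode('ascii')
    except UnicodeDecodeError as exc:
        # Report the failure instead of crashing so the sketch is inspectable.
        return 'decode failed: {0}'.format(exc)


result = decode_as_ascii(trace_bytes)
print(result)
```

If this reading is right, traceback.format_exc() itself succeeds and the failure comes from combining its byte-string result with a unicode format string.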

CleanCut commented 9 years ago

That's an interesting one! Could you provide me a minimal test case that I could use to produce the crash? I don't know how I could reproduce that.

ayrtonmassey commented 9 years ago

It seems to only be a problem if you use the actual characters (e.g. ” (RIGHT DOUBLE QUOTATION MARK) instead of \u201d). See the following example:

module.py:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import re

greeting = u'Hello'
PATTERN = re.compile(u'“{greeting} [A-Z][a-z]+”'.format(greeting=grtng)) # 'grtng' should be 'greeting' - will throw a 'not defined' error

def foo():
    string = '“Hello World!”'
    if re.match(PATTERN,string):
        return True

test.py:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import unittest
from module import foo

class Test(unittest.TestCase):

    def test_function(self):
        assert foo()

What happens is that I attempt to import module, but the import fails with an exception (grtng is not defined). When green tries to log that exception, formatting the traceback fails because the offending source line contains unicode characters, and a UnicodeDecodeError is thrown instead.

A similar thing happens when an exception is thrown (e.g. AssertionError) but the line includes unicode characters:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import unittest

class Test(unittest.TestCase):

    def test_function_with_bug(self):
        assert '“Hello World!”' == False

An AssertionError will be thrown on assert '“Hello World!”' == False but instead of displaying the stack trace, the following exception is thrown:

Failure in test.Test.test_function_with_bug
Traceback (most recent call last):
  File "/home/ayrton/.venv/whosays/bin/green", line 11, in <module>
    sys.exit(main())
  File "/home/ayrton/.venv/whosays/local/lib/python2.7/site-packages/green/cmdline.py", line 79, in main
    result = run(test_suite, stream, args) # pragma: no cover
  File "/home/ayrton/.venv/whosays/local/lib/python2.7/site-packages/green/runner.py", line 130, in run
    result.stopTestRun()
  File "/home/ayrton/.venv/whosays/local/lib/python2.7/site-packages/green/result.py", line 355, in stopTestRun
    self.printErrors()
  File "/home/ayrton/.venv/whosays/local/lib/python2.7/site-packages/green/result.py", line 563, in printErrors
    + "\n{}".format(self.colors.yellow(frame)), level = 3)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 97: ordinal not in range(128)

I imagine the solution is "don't use unicode character literals in your code", but it's worth noting nonetheless.

CleanCut commented 9 years ago

Well, essentially you're right. Python 2 expects byte strings in a lot of places. Even if the file is correctly marked as being UTF-8 encoded, the traceback module doesn't care and expects everything to be an ASCII string.

I haven't been able to figure out a nice way to dig out the actual exception (I'm sure there is a way, I just don't feel like spending the 4 hours it will take to find it on such a rare corner case). I have patched it so that at least it catches the UnicodeDecodeError and tells you the module name that it crashed trying to import and that the traceback couldn't be displayed because of unicode literals. That should at least get people pointed in the right direction.
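
The patch itself is not shown in this thread, so the following is only a rough sketch of the behavior described above: catch the UnicodeDecodeError and fall back to a message that names the module. The function describe_import_failure and its parameters are hypothetical, not green's actual code, and the explicit decode stands in for the implicit ASCII decode that Python 2 performs.

```python
def describe_import_failure(dotted_module, filename, trace_bytes):
    """Build an import-failure message, with a fallback for non-ASCII traces.

    Hypothetical helper for illustration only; not taken from green.
    """
    header = u'Failed to import {0} computed from filename {1}\n'.format(
        dotted_module, filename)
    try:
        # Explicit stand-in for Python 2's implicit ASCII decode.
        trace_text = trace_bytes.decode('ascii')
    except UnicodeDecodeError:
        # Still tell the user which module failed, even without the trace.
        return header + (u'The traceback could not be displayed because '
                         u'it contains unicode literals.')
    return header + trace_text


message = describe_import_failure(
    'pkg.module', 'pkg/module.py', b'bad line: \xe2\x80\x9d')
print(message)
```

The trade-off is the one described above: the real exception stays hidden, but the user at least learns which module failed to import and why no traceback is shown.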

CleanCut commented 9 years ago

Oh, and the fix will be in the 2.0.3 release.

wonderb0lt commented 7 years ago

Hi @CleanCut, I'm still having the problem as described here.

This is my test subject:

#coding=utf-8
def utf8_exception():
    raise('Das Böse ist immer und überall')

And this is my test runner:

import unittest
import subject

class SomeTest(unittest.TestCase):
    def test_subject(self):
        subject.utf8_exception()
        self.assertEqual(1, 2)

And running everything results in:

$ green -vvv
Green 2.7.2, Coverage 4.3.4, Python 2.7.12

test
  SomeTest
E   test_subject

Error in test.SomeTest.test_subject
Traceback (most recent call last):
  File "/home/wonderb0lt/pyenvs/iotkoffer/bin/green", line 11, in <module>
    sys.exit(main())
  File "/home/wonderb0lt/pyenvs/iotkoffer/local/lib/python2.7/site-packages/green/cmdline.py", line 75, in main
    result = run(test_suite, stream, args, testing)
  File "/home/wonderb0lt/pyenvs/iotkoffer/local/lib/python2.7/site-packages/green/runner.py", line 130, in run
    result.stopTestRun()
  File "/home/wonderb0lt/pyenvs/iotkoffer/local/lib/python2.7/site-packages/green/result.py", line 353, in stopTestRun
    self.printErrors()
  File "/home/wonderb0lt/pyenvs/iotkoffer/local/lib/python2.7/site-packages/green/result.py", line 571, in printErrors
    + "\n{}".format(self.colors.yellow(frame)), level=3)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 111: ordinal not in range(128)

MinchinWeb commented 7 years ago

@CleanCut: A while back I added Unidecode, but only activated it on Windows. Should we activate it on Python 2 as well? Or as a command-line switch?

(relevant code here)

CleanCut commented 7 years ago

@MinchinWeb I couldn't find a way to fix this problem with unidecode. I found a way to fix it with some manual decoding, though. If you can find a cleaner way with unidecode, go ahead and change it. You can use the green.test.test_result.TestGreenTestResult.test_printErrors_Py2Unicode test to duplicate @wonderb0lt's error condition.
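
The manual decoding used for the fix is not shown in this thread. As a guess at the general idea only, byte strings can be decoded explicitly as UTF-8 before they are interpolated into a unicode format string, so Python 2 never attempts its implicit ASCII decode. The helper to_text and the sample frame below are hypothetical.

```python
def to_text(value, encoding='utf-8'):
    """Return text; decode byte strings, replacing undecodable bytes."""
    if isinstance(value, bytes):
        # 'replace' substitutes U+FFFD for bad bytes instead of raising.
        return value.decode(encoding, 'replace')
    return value


# Hypothetical traceback frame holding UTF-8 bytes, as Python 2 would
# produce for the test subject in the report above.
frame = b"    raise('Das B\xc3\xb6se ist immer und \xc3\xbcberall')"

# Decoding first keeps the format call free of mixed byte/unicode arguments.
line = u'\n{0}'.format(to_text(frame))
```

Using the 'replace' error handler means a frame in some other encoding degrades to replacement characters rather than crashing the test run.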

CleanCut commented 7 years ago

@wonderb0lt The bug is fixed in 2.7.3 (just released).

CleanCut / green

ImportErrors caused by a line containing unicode text cause UnicodeError when attempting to log trace #77