blackk-foxx / apcsa

Non-printable characters in the program text cause the script to crash #16

Open blackk-foxx opened 5 years ago

blackk-foxx commented 5 years ago

The following attached file caused the error: Pokemon.java.txt

We probably need to validate the contents of each file before calling self.workbook.close().

Here's the stack trace:

Traceback (most recent call last):
  File "RunGoogleClassroom.py", line 248, in <module>
    main()
  File "RunGoogleClassroom.py", line 186, in main
    writer.Close()
  File "/Users/todd/Documents/TEALS/git/apcsa/ExcelWriter.py", line 77, in Close
    self.workbook.close()
  File "/Users/todd/Library/Python/2.7/lib/python/site-packages/xlsxwriter/workbook.py", line 306, in close
    self._store_workbook()
  File "/Users/todd/Library/Python/2.7/lib/python/site-packages/xlsxwriter/workbook.py", line 649, in _store_workbook
    xml_files = packager._create_package()
  File "/Users/todd/Library/Python/2.7/lib/python/site-packages/xlsxwriter/packager.py", line 140, in _create_package
    self._write_shared_strings_file()
  File "/Users/todd/Library/Python/2.7/lib/python/site-packages/xlsxwriter/packager.py", line 287, in _write_shared_strings_file
    sst._assemble_xml_file()
  File "/Users/todd/Library/Python/2.7/lib/python/site-packages/xlsxwriter/sharedstrings.py", line 54, in _assemble_xml_file
    self._write_sst_strings()
  File "/Users/todd/Library/Python/2.7/lib/python/site-packages/xlsxwriter/sharedstrings.py", line 84, in _write_sst_strings
    self._write_si(string)
  File "/Users/todd/Library/Python/2.7/lib/python/site-packages/xlsxwriter/sharedstrings.py", line 122, in _write_si
    self._xml_si_element(string, attributes)
  File "/Users/todd/Library/Python/2.7/lib/python/site-packages/xlsxwriter/xmlwriter.py", line 122, in _xml_si_element
    self.fh.write("""<t%s>%s""" % (attr, string))
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 708, in write
    return self.writer.write(data)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 369, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 14: ordinal not in range(128)
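One possible shape for the validation mentioned above is to decode each file's bytes to text before they ever reach the workbook, so that no raw byte string is left for Python 2 to decode implicitly as ASCII at close time. This is only a sketch: `to_text` is a hypothetical helper, not something that exists in the repo, and the choice of UTF-8 with a Latin-1 fallback is an assumption about what student submissions look like.

```python
def to_text(raw):
    """Hypothetical helper: decode file bytes to text before writing a cell."""
    if not isinstance(raw, bytes):
        # Already text; nothing to decode.
        return raw
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        # Assumed fallback: Latin-1 maps every byte to a code point,
        # so this branch cannot raise. A lone 0xa0 becomes U+00A0.
        return raw.decode('latin-1')
```

Where to call it depends on how ExcelWriter reads the submissions, which is not shown here; the idea is only that every string handed to the worksheet is already decoded.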

Smattr commented 5 years ago

This initially confused me, but once I looked at the attached reproducer it made total sense, and it actually explained something that stumped me during class. The problem is that the file contains the "no-break space" character (U+00A0), stored here as the single byte 0xa0. Specifically, this line:

\t\tScanner\xa0userInput\xa0=\xa0new\xa0Scanner(System.in);

In class I helped a student who had a line that looked correct to me but was highlighted as wrong. Eclipse's error was unhelpful, and after I retyped the line exactly as written everything worked, so I filed it in the back of my head as a heisenbug and went on with my day. It was the line where they constructed the Scanner, and it was probably exactly the line above. I don't see how a student would have unknowingly typed this character, so I wonder if our training materials contain it somewhere.

Note that both standalone javac and Eclipse reject these characters as invalid white space, so whoever uploaded this submission apparently did not compile their code.


For reference, I extracted the above line by reading the file in binary:

# Read in binary so bytes such as 0xa0 are shown as escapes, not decoded.
with open('Pokemon.java.txt', 'rb') as f:
    for line in f.read().split(b'\n'):
        print(repr(line))
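To pinpoint where the offending bytes sit rather than eyeballing the dump, the same binary read can be extended to report positions. A rough sketch, where `find_non_ascii` is a made-up name and the sample bytes are the line quoted above:

```python
def find_non_ascii(data):
    """Return (line_number, column, byte_value) for each byte >= 0x80."""
    hits = []
    for line_no, line in enumerate(data.split(b'\n'), start=1):
        # bytearray yields integers on both Python 2 and Python 3.
        for col, byte in enumerate(bytearray(line), start=1):
            if byte >= 0x80:
                hits.append((line_no, col, byte))
    return hits


# The line from the reproducer, as raw bytes.
sample = b'\t\tScanner\xa0userInput\xa0=\xa0new\xa0Scanner(System.in);'
for line_no, col, byte in find_non_ascii(sample):
    print('line %d, column %d: 0x%02x' % (line_no, col, byte))
```

Running it over the bytes of each submission before writing the workbook would flag files like this one up front, and could also help search the training materials for the same character.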