Robpol86 / terminaltables

Project no longer maintained.
https://github.com/matthewdeanmartin/terminaltables
MIT License

encoding error #39

Closed 0x3h closed 7 years ago

0x3h commented 7 years ago

UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 0: unexpected end of data

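from terminaltables import AsciiTable

# Python 2.7: "\xe9" is a one-byte str that is not valid UTF-8 on its own.
arr = []
arr.append([1, 2, "\xe9"])
table_data = [['a', 'b', 'c']]  # header row
for item in arr:
    table_data.append(item)
table = AsciiTable(table_data)
print "\n%s" % table.table  # raises the UnicodeDecodeError above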

I suggest changing the decode call at width_and_alignment.py#L28 to string.decode('u8', 'ignore'). As a workaround, decoding explicitly in the caller, e.g. "\xe9".decode('u8', 'ignore'), also does the job.

Robpol86 commented 7 years ago

"\xe9".decode('u8', 'ignore') returns u'', which has a length of 0 instead of 1. That would lead to broken/misaligned tables.
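
To make the width problem concrete, here is a minimal sketch (not code from terminaltables). It uses a bytes literal so that it behaves the same on Python 2.7 and Python 3; the variable names are only illustrative.

```python
# A lone 0xE9 byte: the lead byte of a multi-byte UTF-8 sequence
# with no continuation bytes after it.
bad = b"\xe9"

# 'ignore' silently drops whatever cannot be decoded.
decoded = bad.decode("utf-8", "ignore")

# The cell held 1 byte, but the decoded text is empty, so any width
# computed from len(decoded) would be 0 rather than 1.
print(len(bad))
print(len(decoded))
```

A column padded to the decoded length would then be one character narrower than the data that is eventually printed.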

0x3h commented 7 years ago

You could count the non-UTF-8 characters before decoding, then add the missing characters back afterwards as ? or spaces.
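
One way to get a similar effect, sketched here under assumptions: rather than counting bytes up front, this swaps in Python's standard 'replace' error handler, which emits U+FFFD for undecodable input, and then maps that marker to a visible placeholder. The helper name decode_keep_width and its placeholder parameter are hypothetical, not part of terminaltables.

```python
def decode_keep_width(raw, placeholder="?"):
    """Decode UTF-8 bytes, keeping a visible placeholder for bad input.

    Hypothetical helper for illustration only.
    """
    # 'replace' substitutes U+FFFD for input that cannot be decoded,
    # instead of dropping it the way 'ignore' does.
    text = raw.decode("utf-8", "replace")
    # Swap the replacement character for something every terminal can show.
    return text.replace(u"\ufffd", placeholder)


cell = decode_keep_width(b"\xe9")
print(len(cell))  # the lone bad byte still occupies one column
```

Two caveats: a genuine U+FFFD already present in the data would also be turned into the placeholder, and how many markers a longer run of invalid bytes produces may differ between Python versions.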

Robpol86 commented 7 years ago

I think this is out of scope for terminaltables. The exception is "unexpected end of data", which means a bad Unicode string is being fed to terminaltables. (It works on Python 3.5 but not Python 2.7; 3.5 probably handles bad Unicode better.)

I think the proper solution here is to have the caller handle bad unicode data and correct it before trying to use table.table in a try/except block.
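
A caller-side cleanup along those lines might look like the sketch below. The helpers to_text and clean_rows are hypothetical names, and the Latin-1 fallback is an assumption about where the stray 0xe9 byte came from (in Latin-1 it is "é"); a different source encoding would need a different fallback.

```python
def to_text(cell, fallback="latin-1"):
    """Return a table cell as text, decoding byte strings defensively.

    Hypothetical helper; the fallback encoding is an assumption.
    """
    if isinstance(cell, bytes):
        try:
            return cell.decode("utf-8")
        except UnicodeDecodeError:
            # Assumption: bytes that are not valid UTF-8 are Latin-1.
            return cell.decode(fallback)
    return cell  # numbers and already-decoded text pass through unchanged


def clean_rows(rows):
    """Apply to_text to every cell of every row."""
    return [[to_text(cell) for cell in row] for row in rows]


table_data = clean_rows([["a", "b", "c"], [1, 2, b"\xe9"]])
```

The cleaned table_data can then be passed to AsciiTable as in the original report, so the library only ever sees cells that decode cleanly.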