Closed: Chenger1 closed this 3 years ago
Fixed by reading the file with the 'windows-1252' encoding. But now the code encodes each string to 'utf-8' and then decodes it again to get a 'utf-8' string back.
text = list(map(lambda line: line.decode('utf-8'), [line.encode('utf-8') for line in file]))
This seems to add a lot of overhead, but it needs to be checked once all the refactoring is done.
After reading up on string encodings, I learned that decoding from either 'utf-8' or 'windows-1252' returns a Unicode string that we can use directly. So we don't need the re-encoding step anymore.
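A minimal sketch of what that looks like, assuming Python 3 and a file that really is windows-1252 encoded. The helper name `read_lines` and the sample bytes are made up for illustration and are not from the project code:

```python
import os
import tempfile


def read_lines(path, encoding="windows-1252"):
    """Return the file's lines as str objects, decoded once while reading."""
    with open(path, encoding=encoding) as fh:
        return list(fh)


# Hypothetical sample data: byte 0xa7 is the section sign in windows-1252,
# but it is not a valid start byte in UTF-8.
raw = b"\xa7 1. Intro\nplain line\n"

with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(raw)
    demo_path = tmp.name

try:
    # Decoding happens once, inside open(); the result is already Unicode.
    text = read_lines(demo_path)
    # The old encode/decode round trip gives back exactly the same strings.
    roundtrip = [line.encode("utf-8").decode("utf-8") for line in text]
finally:
    os.remove(demo_path)
```

Since `roundtrip` is equal to `text`, the extra `encode('utf-8')` / `decode('utf-8')` pass only costs time and can be dropped.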
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa7 in position 1406: invalid start byte