converter = TranskribusToPrima(ET.parse(input_file), prefer_imgurl)
File "src/lxml/etree.pyx", line 3521, in lxml.etree.parse
File "src/lxml/parser.pxi", line 1880, in lxml.etree._parseDocument
File "src/lxml/parser.pxi", line 1900, in lxml.etree._parseFilelikeDocument
File "src/lxml/parser.pxi", line 1795, in lxml.etree._parseDocFromFilelike
File "src/lxml/parser.pxi", line 1201, in lxml.etree._BaseParser._parseDocFromFilelike
File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
File "src/lxml/parser.pxi", line 721, in lxml.etree._handleParseResult
File "src/lxml/etree.pyx", line 318, in lxml.etree._ExceptionContext._raise_if_stored
File "src/lxml/parser.pxi", line 370, in lxml.etree._FileReaderContext.copyToBuffer
File "/usr/lib/python3.6/codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xce in position 5: invalid continuation byte
In this file, I get: