As study files are usually submitted in various encodings, the study parser regularly fails with encoding errors such as:
Traceback (most recent call last):
  File "pyidr/study_parser.py", line 659, in <module>
    parser = main(sys.argv[1:])
  File "pyidr/study_parser.py", line 632, in main
    p = StudyParser(s)
  File "pyidr/study_parser.py", line 113, in __init__
    self._study_lines = f.readlines()
  File "/Users/sbesson/anaconda3/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa8 in position 7339: invalid start byte
One approach is to unify the study files and enforce a single encoding. This PR explores the alternative: it makes study_parser more lenient by forcing UTF-8 decoding and ignoring decoding errors.
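As a rough illustration of the lenient approach, the sketch below opens a file as UTF-8 and ignores bytes that cannot be decoded. The helper name `read_study_lines` and the sample file content are hypothetical and only serve the example; the actual change in `pyidr/study_parser.py` may look different.

```python
import os
import tempfile


def read_study_lines(path):
    """Hypothetical helper: read lines as UTF-8, dropping undecodable bytes."""
    # errors="ignore" makes the decoder skip invalid bytes instead of raising
    # UnicodeDecodeError.
    with open(path, "r", encoding="utf-8", errors="ignore") as f:
        return f.readlines()


# Write a sample file containing 0xa8, which is not a valid UTF-8 start byte.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"Study Title\tExample \xa8 study\n")
    tmp_path = tmp.name

try:
    lines = read_study_lines(tmp_path)
finally:
    os.remove(tmp_path)

print(lines)
```

Note that `errors="ignore"` silently drops the offending bytes, so some characters may be lost from the parsed values. Using `errors="replace"` instead would keep a visible U+FFFD marker where decoding failed.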
Tested with the idr0072 study file.