I'm running into an error when I try to generate the sentence-level parse for an Arabic document. Here's the error, and I'm attaching the document:
Generate sentence xml file...
/Users/ahalterman/MIT/NSF_RIDIR/UniversalPetrarch/UniversalPetrarch/data/text/syria_xml_1.xml
Traceback (most recent call last):
  File "preprocess_doc.py", line 166, in <module>
    read_doc_input(inputxml, inputparsed, outputfile)
  File "preprocess_doc.py", line 110, in read_doc_input
    if doc.encode('UTF-8').find(line) ==-1:
TypeError: a bytes-like object is required, not 'str'
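This is a Python 2/3 incompatibility. Under Python 3, doc.encode('UTF-8') returns bytes, and bytes.find() refuses the str argument line, which produces the TypeError above. Under Python 2 the same call works because str is already a byte string. Hardcoding python2 for the Python calls in preprocess_doc.sh fixed the problem. Closing, but this should be addressed in #18.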
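For whoever picks this up, here is a rough sketch of a check that behaves the same under Python 2 and 3. It assumes the intent of line 110 is simply "does this line occur in the document text", and it compares both values as text rather than encoding one side. The helper names to_text and contains_line are hypothetical and are not part of preprocess_doc.py.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals


def to_text(value, encoding="utf-8"):
    """Return `value` as text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def contains_line(doc, line):
    """Hypothetical replacement for the failing check: compare text with text."""
    return to_text(line) in to_text(doc)


# Illustrative values only; the real ones come from the input XML and parse.
doc = "دمشق هي عاصمة سوريا."
line = "عاصمة"

found_in_text = contains_line(doc, line)
found_in_bytes = contains_line(doc.encode("utf-8"), line)
```

With something like this, the condition on line 110 could become "if not contains_line(doc, line):" without pinning the interpreter to python2, assuming nothing downstream relies on doc being a byte string.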
syria_xml_1.txt