SunoikisisDC / SunoikisisDC-2021-2022


Problem with treebanking in Greek #12

Closed by ZizhaoXU 2 years ago

ZizhaoXU commented 3 years ago

I have a small technical issue with treebanking in Greek. The second sentence of our set text contains the phrase ὑπ' αὐτῶν. When I put the text into the treebanking tool, even though I have made sure that it was pasted as plain text and that there is no space inside ὑπ', the tool always splits it into two tokens and marks the [ ' ] as punctuation, which is meaningless, so I have to edit the XML file myself to fix it. Is something wrong here?

gabrielbodard commented 3 years ago

No, it's just that the parser that turns your text into an XML file ready for treebanking assumes that ' is a punctuation character (as it would be in some other languages), and so erroneously tags it as such. You can either ignore it in the tree, or if you feel bold, you can edit the underlying XML yourself, as you spotted!
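
For anyone who does want to edit the XML, here is a minimal Python sketch of that fix: it folds a stand-alone apostrophe token back into the preceding word, then renumbers the remaining tokens. The layout it expects (a `sentence` element containing `word` elements with `id`, `form` and `head` attributes) is an assumption about what the tool exports, and `merge_elision_marks`, `ELISION_MARKS` and the sample data are hypothetical names and values made up for illustration, so check them against your own file first.

```python
import xml.etree.ElementTree as ET

# Characters that may have been split off as "punctuation" after an elided word.
# (Assumed set: ASCII apostrophe, right single quote, modifier apostrophe, Greek koronis.)
ELISION_MARKS = {"'", "\u2019", "\u02bc", "\u1fbd"}


def merge_elision_marks(sentence):
    """Fold stand-alone apostrophe tokens into the preceding word, in place.

    Assumes `sentence` has direct <word> children carrying id/form/head
    attributes; this schema is an assumption, not taken from the tool's docs.
    """
    old_to_new = {}  # maps each original id to the id it has after renumbering
    kept = []
    for word in list(sentence.findall("word")):
        old_id = word.get("id")
        if word.get("form") in ELISION_MARKS and kept:
            prev = kept[-1]
            prev.set("form", prev.get("form") + word.get("form"))
            # Anything that pointed at the apostrophe now points at the merged word.
            old_to_new[old_id] = prev.get("id")
            sentence.remove(word)
        else:
            kept.append(word)
            new_id = str(len(kept))
            old_to_new[old_id] = new_id
            word.set("id", new_id)
    # Remap heads; "0" (root) and empty heads are not in the map and stay as they are.
    for word in kept:
        head = word.get("head")
        if head in old_to_new:
            word.set("head", old_to_new[head])
    return sentence


# Made-up sample with placeholder head values, only to show the renumbering.
sample = """<sentence id="2">
  <word id="1" form="ὑπ" head="3"/>
  <word id="2" form="'" head="1"/>
  <word id="3" form="αὐτῶν" head="0"/>
</sentence>"""

sentence = merge_elision_marks(ET.fromstring(sample))
print([(w.get("id"), w.get("form"), w.get("head")) for w in sentence.findall("word")])
# → [('1', "ὑπ'", '2'), ('2', 'αὐτῶν', '0')]
```

Because removing a token shifts every later `id`, it is worth checking the `head` values by eye afterwards, especially if you had already started annotating the sentence.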