Closed by rasoolims 7 years ago
Thanks for catching that, @rasoolims. I have replaced the few '<<' and '>>' characters with '"'. I also found some stray HTML markup at the end of some verses, which I have removed. Would you mind adding your script to my Corpus Tools package? Right now I only have scripts written in Java, and I know that a lot of researchers are more familiar with Python.
Hi,
I wrote a simple script that creates all pairs of aligned files (for Giza++ and other aligners). It seems that the Amharic file has problems: the text contains illegal XML characters.
Thanks
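For anyone who wants to check a corpus file for this kind of problem, here is a minimal Python sketch. It is not the script mentioned above, and the thread does not say which characters were actually at fault in the Amharic file. It assumes "illegal" means any character outside the `Char` range of XML 1.0; the function names are made up for illustration.

```python
import re

# XML 1.0 allows: #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF.
# The negated character class below matches everything outside those ranges.
ILLEGAL_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_illegal_xml_chars(text):
    """Return a list of (line_number, column, code_point) tuples, 1-based."""
    hits = []
    # Split on "\n" only: str.splitlines() would also split on control
    # characters such as \x0b and \x0c, hiding the ones we are looking for.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for match in ILLEGAL_XML_CHARS.finditer(line):
            code_point = f"U+{ord(match.group()):04X}"
            hits.append((line_no, match.start() + 1, code_point))
    return hits


def strip_illegal_xml_chars(text):
    """Return the text with every illegal XML 1.0 character removed."""
    return ILLEGAL_XML_CHARS.sub("", text)
```

To use it, read the file as UTF-8 text, pass the contents to `find_illegal_xml_chars` to see where the offending characters are, and call `strip_illegal_xml_chars` only once you are sure that dropping them is acceptable for the alignment.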