marcverhagen closed 1 year ago
For the second point, a new method get_alignments in mmif-python 0.2.2 can be helpful.
This must have been fixed via #9 and #10. Or, with the new work started by Hayden, this issue might no longer be relevant. @marcverhagen can you confirm?
Maybe. I did notice a comment saying that I did not use get_alignments because it caused errors, so I need to revisit this.
Yes, this can be closed. The some-things-remaining-to-be-done were done.
Some things remain to be done:
Harder is how we want to deal with the named entities from the set of TextDocuments that Tesseract created and that the NER then ran over. My first hunch is to take all those documents and glue them together. I would also like to change the module so it is a bit more view-centric.
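To make the "glue them together" idea concrete, here is a minimal sketch in plain Python. It does not use mmif-python; the function names (glue_documents, locate), the (doc_id, text) input shape, and the newline separator are all assumptions made for illustration, not anything from the module. The point is only that if the documents are concatenated, a table of start offsets is needed to map an entity's position in the glued text back to the document it came from.

```python
from bisect import bisect_right
from typing import List, Tuple


def glue_documents(docs: List[Tuple[str, str]],
                   separator: str = "\n") -> Tuple[str, List[Tuple[int, str]]]:
    """Concatenate (doc_id, text) pairs into one string.

    Returns the glued text plus a list of (start_offset, doc_id) pairs,
    one per input document, in order. Hypothetical helper, not part of
    mmif-python.
    """
    parts = []
    starts = []
    position = 0
    for doc_id, text in docs:
        starts.append((position, doc_id))
        parts.append(text)
        # The next document starts after this text and one separator.
        position += len(text) + len(separator)
    return separator.join(parts), starts


def locate(starts: List[Tuple[int, str]], offset: int) -> Tuple[str, int]:
    """Map an offset in the glued text back to (doc_id, local_offset).

    An offset that falls on a separator is attributed to the document
    before it, with a local offset equal to that document's length.
    """
    start_offsets = [start for start, _ in starts]
    index = bisect_right(start_offsets, offset) - 1
    start, doc_id = starts[index]
    return doc_id, offset - start


# Made-up example data: two short text documents with invented ids.
docs = [("td1", "Hello World"), ("td2", "Good Evening")]
glued, starts = glue_documents(docs)
entity_offset = glued.index("Evening")
print(locate(starts, entity_offset))
```

Whether gluing is the right approach at all is still open; the sketch only shows that the offset bookkeeping is cheap if we go that way.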