Open iurshina opened 2 years ago
Maybe it could be implemented as a regular expression that is passed as a command-line option, for example:
docker run -v $(pwd):/app excite_toolchain segmentation --repeated-author-symbol="ders[.]?|[-]{2,}"
This would make it possible to configure the list in the Web UI frontend, depending on language.
Add a rule for "use the last author name previously recognized" when "ders." is encountered
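A rough sketch of how such a rule might look in Python, assuming the author fields of consecutive references are already available as a list of strings. The function name `resolve_repeated_authors` and the input format are hypothetical and not taken from the toolchain; the pattern is the one from the example command, and in practice it would come from the proposed `--repeated-author-symbol` option.

```python
import re

# Pattern copied from the example command; assumed to arrive via the
# proposed --repeated-author-symbol option in a real setup.
REPEATED_AUTHOR = re.compile(r"ders[.]?|[-]{2,}")


def resolve_repeated_authors(author_fields, pattern=REPEATED_AUTHOR):
    """Replace placeholder author fields with the last recognized author.

    author_fields: author strings in the order the references appear
    (hypothetical input format).
    """
    resolved = []
    last_author = None
    for field in author_fields:
        # Only treat the field as a placeholder if the whole field matches.
        is_placeholder = pattern.fullmatch(field.strip()) is not None
        if is_placeholder and last_author is not None:
            resolved.append(last_author)
        else:
            # Keep the field as-is; a placeholder with no preceding
            # author cannot be resolved and is left untouched.
            resolved.append(field)
            if not is_placeholder:
                last_author = field
    return resolved


if __name__ == "__main__":
    print(resolve_repeated_authors(["Weber, M.", "ders.", "Marx, K.", "---"]))
```

The match is case-sensitive and applies to the whole field, so a name that merely contains "ders" is not replaced. Whether capitalized variants such as "Ders." should also match would depend on the pattern configured per language.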