forTEXT / catma

Computer Assisted Text Markup and Analysis
https://www.catma.de
GNU General Public License v3.0

UTF-8 plain-text source documents that include HTML-encoded special characters such as umlauts break the Tagger #116

Open mpetris opened 9 years ago

mpetris commented 9 years ago

At least texts that include decimal-encoded characters, such as &#234; for ê, break the Tagger and cause strange effects during tagging.

mpetris commented 9 years ago

The text needs to be sanitized to prevent HTML interpretation. That's also a security issue.
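
A minimal sketch of what such sanitizing could look like, assuming the text is escaped just before it is handed to the HTML view. `HtmlEscaper` and `escape` are hypothetical names for illustration, not existing CATMA classes. The idea is that the ampersand in a sequence like `&#234;` is itself escaped, so the browser shows the six literal characters instead of decoding them, and that markup characters such as `<` can no longer open a tag.

```java
// Hypothetical helper for illustration; not part of the CATMA code base.
final class HtmlEscaper {

    private HtmlEscaper() {
        // static utility, no instances
    }

    /**
     * Replaces the characters that HTML treats as markup with their
     * entity equivalents so the text is displayed literally.
     */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                case '\'':
                    out.append("&#39;");
                    break;
                default:
                    // everything else, including non-ASCII such as umlauts, passes through
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

Applying the escape only when rendering, rather than to the stored source document, would leave the document text itself unchanged. Whether that fits the Tagger's handling of the text is an open question here.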