inception-project / inception

INCEpTION provides a semantic annotation platform offering intelligent annotation assistance and knowledge management.
https://inception-project.github.io
Apache License 2.0
593 stars, 151 forks

External recommender fails when CAS contains control characters #1511

Closed: jcklie closed this issue 1 year ago

jcklie commented 4 years ago

Describe the bug

The external recommender fails when the CAS contains control characters.

To Reproduce

Steps to reproduce the behavior:

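  1. Create a document whose text contains a control character, e.g. 第四卷第一四二八页。 followed by a control character (U+0014 in the error reported under Screenshots)
  2. Configure an external recommender for that document
  3. See the error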

Expected behavior

No error and nice predictions.

Screenshots

`Caused by: org.xml.sax.SAXParseException: Trying to serialize non-XML 1.0 character: 0x14 at offset 975 in string starting with 毛主席语录`
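
To find the offending character in a document before it reaches the recommender, a check along the following lines might help. It is only a sketch in Python, not code from INCEpTION or UIMA: the function names are made up for illustration, and the ranges are taken from the `Char` production of the XML 1.0 specification. Offsets are counted in Unicode code points, which can differ from the offsets UIMA reports if the text contains characters outside the Basic Multilingual Plane.

```python
def is_xml10_char(cp: int) -> bool:
    """Return True if the code point matches the XML 1.0 `Char` production."""
    return (
        cp in (0x9, 0xA, 0xD)            # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def first_non_xml10_char(text: str):
    """Return (offset, code point) of the first character that XML 1.0
    cannot represent, or None if the whole text is valid.

    Hypothetical helper for illustration only.
    """
    for offset, ch in enumerate(text):
        if not is_xml10_char(ord(ch)):
            return offset, ord(ch)
    return None


if __name__ == "__main__":
    sample = "第四卷\x14"  # three CJK characters followed by U+0014
    print(first_non_xml10_char(sample))
```

A document for which this returns `None` should not trigger the "non-XML 1.0 character" error shown above; any other result points at the character that does.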

Fix

Do the same as in https://github.com/dkpro/dkpro-core/pull/1426

jcklie commented 4 years ago

@reckart I think we already fixed this in DKPro Core.

reckart commented 4 years ago

I added support for XML 1.1 to the DKPro Core XmiWriter, but I don't think we use the XmiWriter in the external recommender. I expect we use similar code and the code in the external recommender should be adjusted in the same way as the DKPro Core code.
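
As a rough illustration of why switching to XML 1.1 helps: XML 1.1 permits control characters such as U+0014 as long as they are written as character references, whereas XML 1.0 forbids them outright. The Python sketch below shows that idea only. It is not the DKPro Core change and not the INCEpTION serialization code; the function names and the `<text>` wrapper element are assumptions made for the example, and it does not deal with surrogates or the noncharacters U+FFFE and U+FFFF, which remain invalid in both XML versions.

```python
import re

# XML 1.1 `RestrictedChar` production: allowed, but only as character references.
_RESTRICTED = re.compile(r"[\x01-\x08\x0b\x0c\x0e-\x1f\x7f-\x84\x86-\x9f]")


def escape_xml11_text(text: str) -> str:
    """Escape element text so that it is well-formed in an XML 1.1 document.

    Hypothetical helper for illustration only.
    """
    if "\x00" in text:
        # U+0000 cannot be represented in XML 1.1, not even as a reference.
        raise ValueError("U+0000 cannot be represented in XML 1.1")
    # Escape markup characters first so the references added below stay intact.
    text = text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    return _RESTRICTED.sub(lambda m: "&#x{:X};".format(ord(m.group())), text)


def to_xml11_document(text: str) -> str:
    """Wrap the escaped text in a minimal document declared as XML 1.1."""
    return (
        '<?xml version="1.1" encoding="UTF-8"?>\n'
        "<text>" + escape_xml11_text(text) + "</text>"
    )


if __name__ == "__main__":
    print(to_xml11_document("第四卷第一四二八页。\x14"))
```

Whatever consumes the output has to accept XML 1.1 as well, so the receiving side of the external recommender would need to be checked before relying on this approach.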