dkpro / dkpro-cassis

UIMA CAS processing library written in Python
https://pypi.org/project/dkpro-cassis/
Apache License 2.0

Reading xmi then writing again results in broken xmi file. #64

Closed HerrKrishna closed 5 years ago

HerrKrishna commented 5 years ago

If I load the attached XMI file via load_cas_from_xmi and then write it back out with to_xmi without changing anything in the CAS, INCEpTION cannot read the new XMI file.

Bundestag_08-7.zip

To Reproduce

Steps to reproduce the behavior:

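from cassis import load_typesystem, load_cas_from_xmi

# Load the type system that describes the annotation types used in the CAS
with open('TypeSystem.xml', 'rb') as f:
    typesystem = load_typesystem(f)

# Load the CAS from the original XMI file
with open('Bundestag_08-7.txt.xmi', 'rb') as f:
    cas = load_cas_from_xmi(f, typesystem=typesystem)

# Write the unmodified CAS back out as XMI
cas.to_xmi('test_cas.xmi', pretty_print=True)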

Then import test_cas.xmi into INCEpTION.

Expected behavior

I would expect INCEpTION to import the resulting XMI file, just as it did with the original XMI file.

Error message

Error while uploading document test_cas.xmi: XCASParsingException: Error parsing XMI-CAS from source at line -1, column -1: xmi id 1459 is referenced but not defined.
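The error above says that some attribute points at an xmi:id that no element defines. The following is a minimal sketch for checking that by hand, using only the Python standard library and not cassis itself. The helper names defined_ids and references_to are hypothetical, and the inline sample document is made up for illustration. The reference check is a rough heuristic: it matches any attribute whose whitespace-separated value contains the ID, so plain integer features such as begin or end can produce false positives.

```python
import xml.etree.ElementTree as ET

# Namespace used by XMI for the xmi:id attribute.
XMI_NS = "http://www.omg.org/XMI"
XMI_ID = f"{{{XMI_NS}}}id"


def defined_ids(root):
    """Collect every xmi:id value that is defined on some element."""
    return {el.get(XMI_ID) for el in root.iter() if el.get(XMI_ID) is not None}


def references_to(root, target_id):
    """Return (tag, attribute) pairs whose value mentions target_id.

    Heuristic only: integer-valued features that happen to equal
    target_id are reported as well.
    """
    hits = []
    for el in root.iter():
        for name, value in el.attrib.items():
            if name == XMI_ID:
                continue  # the definition itself is not a reference
            if target_id in value.split():
                hits.append((el.tag, name))
    return hits


# Made-up sample: the first token refers to id "2", which is never defined.
SAMPLE = """<xmi:XMI xmlns:xmi="http://www.omg.org/XMI"
                     xmlns:custom="http:///custom.ecore">
  <custom:Token xmi:id="1" begin="0" end="5" head="2"/>
  <custom:Token xmi:id="3" begin="6" end="9" head="1"/>
</xmi:XMI>"""

root = ET.fromstring(SAMPLE)
ids = defined_ids(root)
dangling = "2" not in ids and bool(references_to(root, "2"))
print("id 2 is referenced but not defined:", dangling)
```

To run the same check on the file from this report, one would replace the sample with `root = ET.parse('test_cas.xmi').getroot()` and look up the ID from the error message, here "1459".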


jcklie commented 5 years ago

Thank you for using cassis and for the error report! I am looking into what is going wrong right now.

HerrKrishna commented 5 years ago

The issue resolved itself after upgrading to INCEpTION version 0.11. I was using version 0.9.1 before. Sorry for bothering you.

reckart commented 5 years ago

Hm. The new INCEpTION version imports XMIs leniently again, meaning that it silently discards data it cannot read. I would say the bug in cassis exists nevertheless, right @jcklie?

jcklie commented 5 years ago

It is a real bug in cassis and I fixed it just now.

jcklie commented 5 years ago

I released 0.2.1; it should hopefully be fixed there.