dkpro / dkpro-cassis

UIMA CAS processing library written in Python
https://pypi.org/project/dkpro-cassis/
Apache License 2.0

UIMA allows different numbers of sofa and views but cassis does not #110

Closed jcklie closed 4 years ago

jcklie commented 4 years ago

Describe the bug

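It looks like UIMA's org.apache.uima.cas.impl.XmiCasSerializer.XmiDocSerializer.writeView(Sofa, Collection<TOP>) method only writes the view element to the XMI if there are annotations pointing to that Sofa (i.e. membersString.length() > 0). A sofa without annotations therefore gets no view element, so an XMI file written by UIMA can contain more sofas than views, which cassis rejects: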

        if (membersString.length() > 0) {
          XmlElementName elemName = uimaTypeName2XmiElementName("uima.cas.View");
          startElement(elemName, workAttrs, 0);
          endElement(elemName);
        }

To Reproduce

Use the CAS and type system from here.

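  from cassis import load_typesystem, load_cas_from_xmi

  with open('TypeSystem.xml', 'rb') as ts_file:
      type_system = load_typesystem(ts_file)

  # This is the call that fails in the traceback below
  with open('XmiMultViews/patientX_doc1_RAD.xmi', 'rb') as xmi_file:
      cas = load_cas_from_xmi(xmi_file, typesystem=type_system)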

I get this error:

Traceback (most recent call last):
  File "./cas_mult_views.py", line 12, in <module>
    cas = load_cas_from_xmi(xmi_file, typesystem=type_system)
  File "/usr/local/lib/python3.6/site-packages/cassis/xmi.py", line 40, in load_cas_from_xmi
    return deserializer.deserialize(source, typesystem=typesystem)
  File "/usr/local/lib/python3.6/site-packages/cassis/xmi.py", line 131, in deserialize
    raise RuntimeError("Number of views and sofas is not equal!")
RuntimeError: Number of views and sofas is not equal!

Expected behavior

cassis should create an empty view for every sofa that has no view element in the XMI, instead of raising an error.
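
To make the expected behavior concrete, here is a minimal sketch. It is not the cassis deserializer, only an illustration using the standard library. It assumes the usual XMI layout in which each cas:View refers to its sofa by xmi:id through a sofa attribute and lists its members in a members attribute. The names members_by_sofa and SAMPLE_XMI are made up for this example, and the sample XMI is hand-written, not taken from the linked files.

```python
import xml.etree.ElementTree as ET

# Namespaces assumed to match what UIMA writes into XMI files
CAS_NS = "http:///uima/cas.ecore"
XMI_NS = "http://www.omg.org/XMI"

# Hand-written sample: two sofas, but only one view element,
# because the second sofa has no annotations.
SAMPLE_XMI = """
<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" xmlns:cas="http:///uima/cas.ecore" xmi:version="2.0">
  <cas:NULL xmi:id="0"/>
  <cas:Sofa xmi:id="1" sofaNum="1" sofaID="_InitialView" mimeType="text" sofaString="Hello"/>
  <cas:Sofa xmi:id="2" sofaNum="2" sofaID="emptyView" mimeType="text" sofaString="World"/>
  <cas:View sofa="1" members="8 13"/>
</xmi:XMI>
""".strip()


def members_by_sofa(xmi_text):
    """Map each sofa's xmi:id to the member ids of its view.

    Sofas without a cas:View element get an empty member list
    instead of causing an error (hypothetical helper).
    """
    root = ET.fromstring(xmi_text)

    # Start with an empty view for every sofa in the document
    members = {}
    for sofa in root.iter(f"{{{CAS_NS}}}Sofa"):
        members[sofa.get(f"{{{XMI_NS}}}id")] = []

    # Fill in the members for the views that are actually present
    for view in root.iter(f"{{{CAS_NS}}}View"):
        members[view.get("sofa")] = view.get("members", "").split()

    return members


views = members_by_sofa(SAMPLE_XMI)
```

Starting from the sofas rather than the views means the counts can never disagree: a missing view element simply leaves that sofa's member list empty.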