dkpro / dkpro-cassis

UIMA CAS processing library written in Python
https://pypi.org/project/dkpro-cassis/
Apache License 2.0

FSes that are only transitively referenced cannot be serialized #174

Closed reckart closed 3 years ago

reckart commented 3 years ago

Describe the bug

If a feature structure is only referenced transitively by another feature structure and is never added to the CAS itself, serialization fails.

To Reproduce

from cassis import Cas, TypeSystem


def test_serializing_only_transitively_referenced_feature_structures():
    typesystem = TypeSystem()
    cas = Cas(typesystem)
    FooType = typesystem.create_type("foo.test.Foo")
    typesystem.add_feature(FooType, "bar", "bar.test.Bar")
    BarType = typesystem.create_type("bar.test.Bar")

    # foo is added to the CAS; the Bar instance is only reachable via foo.bar
    foo = FooType()
    cas.add_annotation(foo)
    foo.bar = BarType()

    # Fails with the TypeError shown below
    actual_xmi = cas.to_xmi()

Expected behavior

Serialization should not fail.

Error message

self = <cassis.xmi.CasXmiSerializer object at 0x10cd7d610>
sink = <_io.BytesIO object at 0x10cda3950>
cas = <cassis.cas.Cas object at 0x10cd7d580>, pretty_print = False

    def serialize(self, sink: Union[IO, str], cas: Cas, pretty_print=True):
        xmi_attrs = {"{http://www.omg.org/XMI}version": "2.0"}

        root = etree.Element(etree.QName(self._nsmap["xmi"], "XMI"), nsmap=self._nsmap, **xmi_attrs)

        self._serialize_cas_null(root)

        # Find all fs, even the ones that are not directly added to a sofa
>       for fs in sorted(cas._find_all_fs(), key=lambda a: a.xmiID):
E       TypeError: '<' not supported between instances of 'NoneType' and 'int'

../cassis/xmi.py:308: TypeError
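
The traceback suggests that the transitively referenced feature structure still has `xmiID` set to `None` when the sort runs, so `sorted()` ends up comparing `None` with an `int`. A minimal sketch of the same failure without cassis, where `FakeFs` is a hypothetical stand-in and not a cassis class:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeFs:
    # Hypothetical stand-in for a feature structure; only xmiID matters here.
    xmiID: Optional[int] = None


added = FakeFs(xmiID=1)     # was added to the CAS, so it carries an ID
referenced_only = FakeFs()  # only referenced by another FS, ID is still None

try:
    sorted([added, referenced_only], key=lambda a: a.xmiID)
    error = None
except TypeError as exc:
    # Python 3 refuses to order None against int
    error = str(exc)
```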


reckart commented 3 years ago

Maybe add a parameter to _find_all_fs() which when set to True would generate missing xmiIDs.
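
A rough sketch of that idea, written against a toy model rather than the real cassis internals. The `Fs` class, the standalone `find_all_fs` function, and the `generate_missing_ids` parameter name are all assumptions made for illustration; the actual `_find_all_fs()` may look quite different.

```python
class Fs:
    """Hypothetical stand-in for a cassis feature structure."""

    def __init__(self, xmiID=None, **features):
        self.xmiID = xmiID
        self.features = features


def find_all_fs(roots, generate_missing_ids=False):
    """Collect all FSes reachable from roots, following feature references."""
    seen = {}  # id(obj) -> obj; dict keeps discovery order
    stack = list(roots)
    while stack:
        fs = stack.pop()
        if id(fs) in seen:
            continue
        seen[id(fs)] = fs
        # Follow features whose value is itself a feature structure
        stack.extend(v for v in fs.features.values() if isinstance(v, Fs))

    all_fs = list(seen.values())
    if generate_missing_ids:
        # Continue numbering after the largest ID already in use
        used = (fs.xmiID for fs in all_fs if fs.xmiID is not None)
        next_id = max(used, default=0) + 1
        for fs in all_fs:
            if fs.xmiID is None:
                fs.xmiID = next_id
                next_id += 1
    return all_fs


bar = Fs()                  # never added directly, so no ID yet
foo = Fs(xmiID=1, bar=bar)  # added, and references bar

ordered = sorted(find_all_fs([foo], generate_missing_ids=True), key=lambda a: a.xmiID)
```

With the flag set, every collected feature structure has an integer `xmiID` before sorting, so the comparison in the serializer would no longer see `None`.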