dmitriydligach closed this issue 4 years ago
From the code, it looks like the entry references another entry that is not defined. Can you look in your data and tell me which entry has ID 0? I would guess it is <cas:NULL xmi:id="0"/>.
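One way to check which entry carries xmi:id="0" is a short script like the one below. This is only a sketch using the Python standard library, not part of cassis: the helper name find_by_xmi_id and the SAMPLE document are made up for illustration, and for a real file you would read its contents instead of using the inline string.

```python
import xml.etree.ElementTree as ET

# Standard XMI namespace URI; ElementTree exposes xmi:id under this expanded name.
XMI_NS = "http://www.omg.org/XMI"


def find_by_xmi_id(xmi_text, wanted_id):
    """Return (tag, attributes) for every element whose xmi:id equals wanted_id.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xmi_text)
    key = f"{{{XMI_NS}}}id"
    return [
        (el.tag, dict(el.attrib))
        for el in root.iter()
        if el.get(key) == str(wanted_id)
    ]


# Made-up miniature XMI document standing in for the real file.
SAMPLE = """<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" xmlns:cas="http:///uima/cas.ecore" xmi:version="2.0">
  <cas:NULL xmi:id="0"/>
  <cas:Sofa xmi:id="8" sofaNum="1"/>
</xmi:XMI>"""

matches = find_by_xmi_id(SAMPLE, 0)
for tag, attributes in matches:
    print(tag, attributes)
```

Note that this matches only the xmi:id attribute, so plain id="0" attributes on annotations are not reported.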
Ok, I do see this in my XMI file:
xmi:version="2.0">
but also:
<textsem:PersonTitleAnnotation xmi:id="25240" sofa="8" begin="163" end="167" id="0" typeID="0" discoveryTechnique="0" ...
<textsem:DateAnnotation xmi:id="25492" sofa="8" begin="2482" end="2486" id="0" typeID="0" discoveryTechnique="0" ...
and others.
Does this help?
Are all these from the same file??
Ah - some have an "id" attribute with value 0, but the "xmi:id" is always different.
@dmitriydligach Do you have an href="#0" somewhere in your XMI?
@reckart Sorry, to clarify, there's only one entry with xmi:id="0". It's this one:
xmlns:type2="http:///org/apache/ctakes/constituency/parser/uima/type.ecore" xmi:version="2.0">
@jcklie I did not find href="#0" in this XMI file.
It is really difficult to guess what the error could be. Can you put a print statement before the line where it fails and tell me which annotation type it is that breaks, maybe even post this annotation?
@jcklie Sure, I added a few print statements:
# Resolve references
if typesystem.is_collection(fs.type, feature):
    # A collection of references is a list of integers separated
    # by single spaces, e.g. <foo:bar elements="1 2 3 42" />
    targets = []
    for ref in value.split():
        target_id = int(ref)
        if target_id == 0:
            print('target_id:', target_id)
            print('value:', value)
            print('fs:', fs)
            print('feature_name:', feature_name)
            print('fs.type:', fs.type)
            print('feature:', feature)
            # print('feature_structures:', feature_structures)
These print the following right before it crashes:
target_id: 0
value: 0 9911
fs: org_apache_ctakes_typesystem_type_relation_CollectionTextRelation(xmiID=12107, members='0 9911', id='0', category=None, discoveryTechnique='0', confidence='0.0', polarity='0', uncertainty='0', conditional='false', type='org.apache.ctakes.typesystem.type.relation.CollectionTextRelation')
feature_name: members
fs.type: org.apache.ctakes.typesystem.type.relation.CollectionTextRelation
feature: Feature(name='members', rangeTypeName='uima.cas.FSList', description='A super-type for relationships between multiple spans of text.', elementType='org.apache.ctakes.typesystem.type.relation.RelationArgument', multipleReferencesAllowed=None, _has_reserved_name=False)
So, I think I found the corresponding annotation from the XMI file:
<relation:CollectionTextRelation xmi:id="12107" id="0" discoveryTechnique="0" confidence="0.0" polarity="0" uncertainty="0" conditional="false" members="0 9911"/>
Does this help?
Aside from the bug in cassis: if I see this right, the relation has a list feature members that contains a null value. The latter triggers the reference to xmi:id=0 (the null value feature structure). So maybe your software also has a bug in the first place, in that this null reference shouldn't even be there (i.e. it should maybe be members="9911")?
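To illustrate the point about the null reference, here is a minimal sketch of how a reference list such as members="0 9911" could be resolved while treating id 0 as null. It is not the actual fix in cassis: the function resolve_members, the plain dict of feature structures, and the choice to map id 0 to None are all assumptions made for this example.

```python
def resolve_members(value, feature_structures):
    """Resolve a space-separated list of xmi:id references.

    Assumption for this sketch: id 0 denotes the null feature structure
    and is mapped to None instead of being looked up.
    """
    targets = []
    for ref in value.split():
        target_id = int(ref)
        if target_id == 0:
            # Null reference: nothing to look up.
            targets.append(None)
        else:
            targets.append(feature_structures[target_id])
    return targets


# Hypothetical stand-in for the parsed feature structures, keyed by xmi:id.
feature_structures = {9911: "RelationArgument#9911"}

resolved = resolve_members("0 9911", feature_structures)
print(resolved)
```

Whether the null entry should be kept as None or dropped from the list depends on how the consumer of members expects to see it.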
@reckart Thanks for pointing it out. Most likely this is not an issue with our software -- instead it might be an annotation error (we have a reader that populates these things in the CAS).
@dmitriydligach I pushed a fix. Can you check whether it works for you in master?
@jcklie It worked! Thank you so much for addressing this issue so quickly. I will close it.
I released 0.2.8
I have several hundred XMI files that I'd like to interact with using CASSIS. I was able to read 50-60 of them successfully (thank you for addressing the issues I recently pointed out!). However, one XMI file causes a problem. Unfortunately, I am not able to give you access to this file, but perhaps you have some ideas what the problem might be from looking at the error?
The code is roughly this:
Here's the error: